ABSTRACT

Non-gravitational processes, such as feedback from galaxies and their active nuclei, are believed 
to have injected excess entropy into the intracluster gas, and therefore to have modified the density 
profiles in galaxy clusters during their formation. Here we study a simple model for this so-called 
preheating scenario, and ask (i) whether it can simultaneously explain both global X-ray scaling 
relations and number counts of galaxy clusters, and (ii) whether the amount of entropy required 
evolves with redshift. We adopt a baseline entropy profile that fits recent hydrodynamic simulations, 
modify the hydrostatic equilibrium condition for the gas by including $\approx 20\%$ non-thermal pressure
support, and add an entropy floor $K_0$ that is allowed to vary with redshift. We find that the observed
luminosity-temperature ($L-T$) relations of low-redshift ($\langle z \rangle = 0.05$) HIFLUGCS clusters and
high-redshift ($\langle z \rangle = 0.80$) WARPS clusters are best simultaneously reproduced with an evolving entropy
floor of $K_0(z) = 341(1+z)^{-0.83}h^{-1/3}$ keV cm$^2$. If we restrict our analysis to the subset of bright
($kT \gtrsim 3$ keV) clusters, we find that the evolving entropy floor can mimic a self-similar evolution
in the $L-T$ scaling relation. This degeneracy with self-similar evolution is, however, broken when
($0.5 \lesssim kT \lesssim 3$ keV) clusters are also included. The $\sim 60\%$ entropy increase we find from $z = 0.8$ to
$z = 0.05$ is roughly consistent with that expected if the heating is provided by the evolving global
quasar population. Using the cosmological parameters from the WMAP 3-year data with $\sigma_8 = 0.76$,
our best-fit model underpredicts the number counts of the X-ray galaxy clusters compared to those
derived from the 158 deg$^2$ ROSAT PSPC survey. Treating $\sigma_8$ as a free parameter, we find a best-fit
value of $\sigma_8 = 0.80 \pm 0.02$, in good agreement with the results from a recent combined analysis of the
Lyman-$\alpha$ forest, 3D weak lensing and WMAP 3-year data. For the flux-limited cluster catalogs, we
include an intrinsic scatter in log-luminosity at both fixed temperature ($\sigma_{\ln L|T} \sim 0.3$) and at fixed
mass ($\sigma_{\ln L|M} \sim 0.6$), but we find this does not have a big effect on our results.
1. INTRODUCTION

Galaxy clusters, the most massive bound objects in 
the universe, provide several methods to constrain cos- 
mological models, for example through their abundance 
(e.g., Evrard 1989; Henry & Arnaud 1991; White, Efs- 
tathiou & Frenk 1993; Eke, Cole & Frenk 1996; Viana 
& Liddle 1999; Mantz et al. 2007), or their spatial dis- 
tribution (Schuecker et al. 2001; Refregier, Valtchanov, 
& Pierre 2002; Hu & Haiman 2003; Blake & Glaze- 
brook 2003; Seo & Eisenstein 2003; Linder 2003), or both 
(Schuecker et al. 2003). In large future surveys, with
tens of thousands of clusters, percent-level statistical
constraints are expected to be available on dark energy
parameters (Haiman, Mohr & Holder 2001), including
constraints on the evolution of its equation of state
parameter $w_a = -dw/da$ (Weller, Battye, & Kneissl 2002;
Weller & Battye 2003; Wang et al. 2004).

In order to fully realize the cosmological potential of 
large cluster samples, it is important to understand the 
cluster mass-observable relations accurately, at least sta- 
tistically. It is very unlikely that the structure of clusters 
will be understood from ab-initio calculations to the level 
of precision required for the theoretical uncertainties not 
to dominate over the exquisite statistical errors (e.g. 
Levine, Schultz, & White 2002). However, in principle,
when multiple observables depend on the same mass, the 
mass-observable relation can be accurately determined 
from the data itself, simultaneously with cosmological 
parameters. Several works have proposed and quantified
the constraints from such "self-calibration" (Majumdar
& Mohr 2004; Wang et al. 2004; Lima & Hu 2005), using
parameterized phenomenological forms for the mass-observable
relations (for example, power-law scalings, or
arbitrary evolution in pre-specified redshift bins). It has
been argued recently (Younger et al. 2006) that even if
cluster structure is not precisely predictable, parameterized
physical models can further improve on such phenomenological
self-calibration, especially when multiple
observables (such as X-ray flux and Sunyaev-Zel'dovich
[SZ] decrement) can be predicted from the same physical
model. In light of this potential, it
is important to fit physically motivated cluster models to
as many cluster observables as possible; one then hopes
that future observations of larger cluster samples will require
further fine-tuning of these models, and, at the
same time, deliver useful cosmological constraints (Ostriker,
Bode & Babul 2005; Younger et al. 2006).

The gravitational potential of clusters is dominated by 
dark matter, whose behavior is determined by gravity 
alone, and is therefore robustly predictable. The dark 
matter profiles of galaxy clusters, apart from the innermost
regions, are indeed well understood from three-dimensional
numerical simulations (Navarro, Frenk &
White 1997; Moore et al. 1998), and are nearly self-similar,
as expected. The physics of gas, on the other
hand, involves complicated non-gravitational processes 
such as radiative cooling and star formation, galaxy evo- 
lution, and various forms of feedback. If these processes 
were unimportant, the intracluster gas would trace the 
self-similar dark matter profile, and its global proper- 
ties should obey simple scaling relations (Kaiser 1986). 
Specifically, its X-ray luminosity $L$, if dominated by thermal
Bremsstrahlung, as for clusters with temperature
$T \gtrsim 2$ keV, should scale as $L \propto T^2$. This relation is
indeed obeyed by clusters in hydrodynamic simulations
without non-gravitational processes (Evrard, Metzler, &
Navarro 1996; Bryan & Norman 1998). However, the observed
$L-T$ scaling relation is significantly steeper than
the self-similar prediction, closer to $L \propto T^3$ (Markevitch
1998; Arnaud & Evrard 1999). This demonstrates that
the effect of non-gravitational processes on the intraclus- 
ter gas is not negligible, even for "bulk" observables. 

A long-standing proposal for the dominant such non- 
gravitational effect is that the intracluster gas is heated 
by some energy input (from star formation, supernova
explosions, galactic winds and/or active galactic nuclei
[AGN]), raising the gas to a higher adiabat before the
clusters collapse. Many authors have investigated the effect
of such preheating, and have shown that simply
imposing a minimum "entropy floor" for the intracluster
gas naturally breaks the self-similarity, and steepens
the $L-T$ relation as required by the data (Kaiser
1991; Evrard & Henry 1991; Cavaliere, Menci, & Tozzi 
1997; Tozzi & Norman 2001; Babul et al. 2002; Voit et 
al. 2002). The preheating idea is further supported by
the discovery of excess entropy in the inner regions of 
low-temperature clusters, which suggests the existence 
of a universal entropy floor (Ponman, Cannon & Navarro 
1999; Lloyd-Davies, Ponman, & Cannon 2000), and by 
several other independent lines of evidence (for a brief 
summary and a list of references, see, e.g., Bialek, Evrard 
& Mohr 2001). 

A simple model of pre-heating consists of shifting the 
entropy profile by an overall additive constant, represent- 
ing the cumulative effect of non-gravitational processes, 
assumed to be roughly uniform throughout the gas (e.g. 
Voit et al. 2002). Recent work has tested this simple 
model, by comparing its predictions with hydrodynami- 
cal simulations (Younger & Bryan 2007, hereafter YB07). 
The model reproduces the simulation results very well, 
but comparisons with observations show that although it 
can predict the global X-ray scaling relations, the model 
cannot reproduce the observed entropy profiles (Ponman,
Sanderson, & Finoguenov 2003; Pratt & Arnaud
2005; Pratt, Arnaud, & Pointecouteau 2006) in detail. 
This requires the model to be further developed, but as 
far as the global properties are concerned, it appears to 
be successful, and it is therefore useful to understand the 
average properties of the intracluster gas. 

In this paper, we adopt this simple preheating model,
and focus on comparisons with both the observed $L-T$
scaling relations in the redshift range $0 \lesssim z \lesssim 1$, and the
observed cumulative number counts of the X-ray clusters.
A previous study (Bialek, Evrard & Mohr 2001)
calculated the impact of preheating on the X-ray scaling
relations, using a sample of 12 simulated clusters, and
found a good fit to the data on local clusters (but did
not explicitly compare the expected evolution to observations,
and did not make simultaneous predictions for
the number counts). Our work is also somewhat similar
to a more recent study by Ostriker, Bode & Babul
(2005), who present a more detailed physical model for
the intracluster gas, and show that it can reproduce local
X-ray scaling relations (this paper also did not study
evolution).

Our goal here is to clarify (i) whether the model can
simultaneously explain both the scaling relations and
number counts of galaxy clusters, and (ii) whether the
amount of entropy required evolves with redshift. In
comparing our predictions to the $L-T$ scaling relations
and the number counts, we also study the effects of scatter
in the $L-T$ and $L-M$ relations, and the corresponding
selection biases that arise in flux-limited surveys
(Nord et al. 2007). Our first goal is motivated by our earlier
study (Younger et al. 2006), in which we found that
a similar preheating model, with the entropy adjusted
to reproduce observed X-ray and SZ scaling relations,
tends to overpredict the number counts of bright clusters,
even with a relatively low normalization ($\sigma_8 = 0.7$)
of the power spectrum. A similar discrepancy was found
by Ostriker, Bode & Babul (2005), although they used
a higher normalization, $\sigma_8 = 0.84$, and suggested that
agreement can be recovered by lowering this value.

The rest of this paper is organized as follows. In
§ 2, we describe in detail the formalism to implement
the preheating model. In § 3, we compare the predicted
$L-T$ scaling relations to observations, and find the best-fit
entropy level at two different redshifts. In § 4, we
further test the model by comparing predictions for the
number counts with observations. In § 5, we then study
the effect of intrinsic scatter and the corresponding selection
effects in flux-limited cluster surveys. In § 6, we
discuss our results, and in § 7, we offer our conclusions.

2. MODIFIED ENTROPY MODEL OF PREHEATING 

We adopt the terminology from the literature, and refer
to the quantity

$$K = \frac{P}{\rho_g^{\gamma}} \qquad (1)$$

as "entropy". Here $P$ and $\rho_g$ are the pressure and density
of the gas, and $\gamma$ is the adiabatic index. For an ideal
gas, $K$ is related to the formal thermodynamic entropy
per particle $s$ by $s = \ln K^{3/2} + s_0$, with $s_0$ a constant. In
this paper, the baseline entropy profile to be modified is
adopted from YB07$^1$, which is a fit to that of the clusters
in AMR simulations (Voit, Kay & Bryan 2005) without
non-gravitational processes. The profile is self-similar
when expressed as a function of the gas fraction $f_g$, and
$^1$ To examine the sensitivity of our conclusions below to the
choice of this baseline profile, we also tried adopting the entropy
profile of gas that traces the DM distribution in an NFW halo. We
have verified that our main conclusion below, that the entropy floor
increases with cosmic time, still holds in this case. In particular,
following the procedure in Younger et al. (2006), but assuming $f_g = 0.9$
and 20% non-thermal pressure support, we find $K_0$ increases
from 363 at $z = 0.8$ to ^Olt^Jj $h^{-1/3}$ keV cm$^2$ at $z = 0.05$ (these
numbers include intrinsic scatter, and are to be compared with the
values obtained in our fiducial model in § 5.1).
normalized by $K_{\rm vir}$,

$$\frac{K(f_g)}{K_{\rm vir}} = 0.18 + 0.2 f_g + 1.5 f_g^2. \qquad (2)$$

Here $f_g(<r) = M_g(<r)/(f_b M_{\rm vir})$ is the gas mass
inside radius $r$, normalized by the cosmic mass fraction
of baryons ($f_b = \Omega_b/\Omega_m$) times the total virial
mass of the cluster $M_{\rm vir}$. We further define
$T_{\rm vir} = G M_{\rm vir}\mu m_p/(2 r_{\rm vir})$, which is the temperature of the corresponding
isothermal sphere (Voit et al. 2002). (Throughout
this paper, we absorb $k_B$ into $T$, so temperature
is in units of energy.) The mean molecular weight
$\mu = 0.59$ is adopted for the intracluster gas, appropriate
for a fully ionized H-He plasma with helium mass
fraction $Y_{\rm He} = 0.25$; $m_p$ is the proton mass, and
$r_{\rm vir}$ is the virial radius. $K_{\rm vir}$ is then calculated by
$K_{\rm vir} = T_{\rm vir}/[(f_b\rho_{\rm vir})^{\gamma-1}\mu m_p]$, where $\rho_{\rm vir}$ is the mean
density of the cluster within the virial radius (it relates
$M_{\rm vir}$ to $r_{\rm vir}$ by $M_{\rm vir} = \frac{4\pi}{3}\rho_{\rm vir} r_{\rm vir}^3$; see below for its
calculation).

The effect of preheating is then realized by adding a
constant $K_0$ to $K(f_g)$,

$$\hat{K}(f_g) = K(f_g) + K_0, \qquad (3)$$

where the value of $K_0$ can be determined once the
amount of energy injected into the cosmic gas, and the
density of the gas at the time of the injection, are specified.
Convective stability requires the specific entropy $K$
to be a monotonically increasing function of radius (Voit
et al. 2002), and hence of $f_g$. The above prescription of
preheating may change $f_g$ as a function of $r$, but it does
not change the order of the gas shells. The entropy profile,
together with the hydrostatic equilibrium and gas
mass conservation equations,

$$\frac{dP}{dr} = -\eta\,\rho_g\,\frac{G M_{\rm tot}(<r)}{r^2}, \qquad (4)$$

$$\frac{dM_g(<r)}{dr} = 4\pi r^2 \rho_g, \qquad (5)$$

can be used to solve for the pressure and density distribution
of the intracluster gas. Combined with the equation
of state for ideal gases, the temperature profile of
the gas also follows from the solutions. In equation (4),
$M_{\rm tot}(<r) = M_{\rm DM}(<r) + M_g(<r)$. The dark matter
profile $M_{\rm DM}(<r)$ is known, and is given below. Including
$\eta$ allows deviations from strict hydrostatic equilibrium.
Here we adopt $\eta = 0.8$, the value YB07 find in
their simulations, suggesting that the remaining support
for the gas is provided by turbulent motions.
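As a minimal sketch of how equations (2)-(5) fit together, the right-hand sides of the two structure equations can be written as a single function that any standard ODE integrator could step outward in radius. The code below works in illustrative units with $G = M_{\rm vir} = r_{\rm vir} = 1$; the function names, the unit choice, and the value $f_b = 0.022/0.13$ (the ratio of the WMAP 3-year baryon and matter densities adopted later in the text) are assumptions made for the example, not part of the original calculation.

```python
import math

GAMMA = 5.0 / 3.0      # adiabatic index
ETA = 0.8              # thermal fraction of the pressure support
F_B = 0.022 / 0.13     # baryon fraction (assumed from the WMAP 3-year values)
CONC = 5.0             # NFW concentration parameter

def nfw_m(x):
    """Dimensionless NFW enclosed-mass function m(x) = ln(1+x) - x/(1+x)."""
    return math.log(1.0 + x) - x / (1.0 + x)

def dark_matter_mass(r):
    """M_DM(<r) in units of M_vir, with r in units of r_vir."""
    return (1.0 - F_B) * nfw_m(CONC * r) / nfw_m(CONC)

def entropy(f_g, k_vir, k0):
    """Baseline profile of equation (2) plus the constant floor K_0."""
    return k_vir * (0.18 + 0.2 * f_g + 1.5 * f_g ** 2) + k0

def rhs(r, pressure, m_gas, k_vir, k0):
    """Return (dP/dr, dM_g/dr) from equations (4) and (5), with G = 1."""
    f_g = m_gas / F_B                      # M_g / (f_b M_vir), M_vir = 1
    rho_g = (pressure / entropy(f_g, k_vir, k0)) ** (1.0 / GAMMA)
    m_tot = dark_matter_mass(r) + m_gas
    dp_dr = -ETA * rho_g * m_tot / r ** 2
    dm_dr = 4.0 * math.pi * r ** 2 * rho_g
    return dp_dr, dm_dr

# K_vir in these units: (T_vir / mu m_p) / (f_b rho_vir)^(gamma - 1),
# with T_vir / (mu m_p) = G M_vir / (2 r_vir) = 0.5 and rho_vir = 3 / (4 pi).
RHO_VIR = 3.0 / (4.0 * math.pi)
K_VIR = 0.5 / (F_B * RHO_VIR) ** (GAMMA - 1.0)
```

At a fixed pressure, raising `k0` lowers the gas density returned inside `rhs`, which is the mechanism by which the entropy floor flattens the inner gas profile.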

The boundary condition for $M_g(<r)$ is naturally chosen
to be zero at the origin (to avoid numerical difficulties,
in practice we give $M_g$ a small value at some small
finite radius). The pressure at the same position is found
by giving it a trial value and integrating equations (4-5)
until the pressure at $r_{\rm vir}$ matches the expected momentum
flux of infalling gas,

$$P(r_{\rm vir}) = \frac{1}{3} f_b\,\rho_{\rm NFW}(r_{\rm vir})\,v_{\rm ff}^2. \qquad (6)$$

Here we assume the accreting gas is cold (Voit et al.
2003), and that it falls freely from the turnaround radius
($r_{\rm ta}$) and is shocked at the virial radius. We assume
$r_{\rm ta} = 2r_{\rm vir}$, so that the free-fall velocity $v_{\rm ff}$ from $r_{\rm ta}$ to
$r_{\rm vir}$ is given by $v_{\rm ff}^2 = GM_{\rm vir}/r_{\rm vir}\,(= 2T_{\rm vir}/\mu m_p)$. The
postshock gas density is $f_b\rho_{\rm NFW}$ (see below for the calculation
of $\rho_{\rm NFW}$). Under extreme conditions, the free-fall
kinetic energy is totally transformed into thermal energy,
and the post-shock gas has a pressure as given above;
this value agrees with that adopted in YB07, matching
their simulation results. (Besides the difference in identifying
clusters, our boundary pressure has a numerical
factor of $\frac{1}{3}$ compared to theirs of 0.7.) These two boundary
conditions are sufficient for solving equations (4-5).
The result is that the gas fraction $f_g$ within the virial radius
is 0.88 without preheating, and somewhat less when
preheating is turned on.
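The outer boundary condition can be illustrated with a short sketch in the same style of units ($G = M_{\rm vir} = r_{\rm vir} = 1$). It assumes that the infalling gas moves in the potential of a point mass $M_{\rm vir}$ between $r_{\rm ta}$ and $r_{\rm vir}$, and it takes the numerical factor of 1/3 that follows when the free-fall kinetic energy is converted entirely into thermal energy for $\gamma = 5/3$ (so that $\frac{1}{2}\rho v^2 = \frac{3}{2}P$). The helper names and the baryon fraction in the usage line are hypothetical.

```python
import math

def free_fall_speed_sq(r_end, r_start, g_m=1.0):
    """v^2 gained falling from rest at r_start to r_end around a point mass (g_m = G M)."""
    return 2.0 * g_m * (1.0 / r_end - 1.0 / r_start)

def nfw_density(r, conc=5.0):
    """NFW density in units of M_vir / r_vir^3, with r in units of r_vir."""
    m_c = math.log(1.0 + conc) - conc / (1.0 + conc)
    x = conc * r
    return conc ** 3 / (4.0 * math.pi * m_c * x * (1.0 + x) ** 2)

def boundary_pressure(f_b, conc=5.0):
    """Post-shock pressure at r_vir: (1/3) f_b rho_NFW(r_vir) v_ff^2."""
    v_sq = free_fall_speed_sq(1.0, 2.0)    # fall from r_ta = 2 r_vir to r_vir
    return f_b * nfw_density(1.0, conc) * v_sq / 3.0

p_vir = boundary_pressure(f_b=0.17)        # illustrative baryon fraction
```

In a shooting scheme, the trial central pressure would be adjusted until the integrated $P(r_{\rm vir})$ equals `p_vir`.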

The matter distribution in virialized clusters is well
described by the NFW (Navarro, Frenk & White 1997)
model, as found from pure CDM N-body simulations.
Adiabatic hydrodynamical simulations without non-gravitational
processes find gas density profiles quite similar
to the NFW shape, except in the central regions (Voit
et al. 2002), where the gas density levels off. When preheating
is turned on, the inner gas density profile becomes
even shallower and deviates more from the NFW
shape. Since the gas is subdominant in mass, we neglect
its effect on the distribution of dark matter (though
it is found that gas will cause the dark matter halo to
be slightly more concentrated; e.g. Lin et al. 2006). For
simplicity, here we assume the dark matter profile retains
the NFW shape, i.e. $\rho_{\rm DM}(r) = (1 - f_b)\rho_{\rm NFW}(r)$. For a
cluster virialized at redshift $z$ with mass $M_{\rm vir}$, its NFW
density profile is given as,

$$\rho_{\rm NFW}(r) = \frac{\delta_c\,\rho_c(z)}{(r/r_s)(1 + r/r_s)^2}, \qquad (7)$$

where $\rho_c$ is the critical density of the universe, and $\delta_c$
and $r_s$ are parameters determined from the concentration
parameter $c = r_{\rm vir}/r_s$. We neglect the weak dependence
of $c$ on $M_{\rm vir}$ and $z$, and simply adopt a constant $c = 5$
in this paper. We identify clusters virialized at redshift
$z$ as spherical regions with mean density $\rho_{\rm vir} = \Delta_v\rho_c(z)$,
with $\Delta_v$ given as a fitting formula by Kuhlen et al. (2005,
based on the spherical collapse model),



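$$\Delta_v = 18\pi^2\,\Omega_m(z)\left[1 + a\,\Theta(z)^{b}\right], \qquad (8)$$

where $\Theta(z) = \Omega_m^{-1}(z) - 1$, $\Omega_m(z)$ is the matter density
normalized by $\rho_c(z)$, and $a = 0.432 - 2.001(|w(z)|^{0.234} - 1)$,
$b = 0.929 - 0.222(|w(z)|^{0.727} - 1)$, with $w(z)$ the dark
energy equation of state.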
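The halo definition above can be sketched numerically for the special case of a cosmological constant ($w = -1$), where the fit coefficients reduce to $a = 0.432$ and $b = 0.929$. The functional form $18\pi^2\,\Omega_m(z)[1 + a\Theta^b]$ used below is an assumption: it is the commonly quoted form of this type of fit, and it reproduces the Einstein-de Sitter value $18\pi^2$ when $\Omega_m = 1$. The relation between $\delta_c$ and $c$ follows from requiring the mean NFW density inside $r_{\rm vir}$ to equal $\Delta_v\rho_c$. Function names are hypothetical.

```python
import math

def omega_m_z(z, omega_m0):
    """Matter density parameter at redshift z in flat LambdaCDM."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)

def virial_overdensity(z, omega_m0):
    """Delta_v relative to the critical density, for w = -1 (a = 0.432, b = 0.929)."""
    om = omega_m_z(z, omega_m0)
    theta = 1.0 / om - 1.0
    return 18.0 * math.pi ** 2 * om * (1.0 + 0.432 * theta ** 0.929)

def nfw_delta_c(delta_v, conc=5.0):
    """Characteristic overdensity delta_c such that the mean density within r_vir is delta_v rho_c."""
    m_c = math.log(1.0 + conc) - conc / (1.0 + conc)
    return delta_v * conc ** 3 / (3.0 * m_c)

# Illustrative usage with Omega_m = (Omega_m h^2) / h^2 = 0.13 / 0.73^2.
omega_m0 = 0.13 / 0.73 ** 2
delta_v0 = virial_overdensity(0.0, omega_m0)
delta_c0 = nfw_delta_c(delta_v0)
```

With these inputs, `delta_v0` comes out near 100, in line with the familiar low-redshift virial overdensity relative to critical density in a $\Lambda$CDM universe.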

3. PREHEATING FROM THE L-T SCALING RELATIONS 

Once the density, temperature and pressure profiles of
the intracluster gas are specified, global properties, such
as the X-ray luminosity, the emission-weighted temperature,
and the Sunyaev-Zel'dovich decrement can be readily
calculated. Here we compare predictions of the modified
entropy model for the luminosity-temperature scaling
relations with those inferred from X-ray observations.
This choice is motivated by simplicity and robustness:
the total luminosity ($L$) and temperature ($T$)
can be inferred from observations without referring to
a model for the intracluster gas. Comparisons to relations
involving the mass of the cluster (such as the mass-temperature
relation) are somewhat more direct from
a theoretical point of view, but any such comparison
would, in any case, have to re-derive cluster masses, using
information such as the observed X-ray surface brightness
or temperature profiles, and using our own model,
for a fair comparison with the data. We also emphasize
that similar comparisons with SZ observables will
contain valuable additional information (e.g. McCarthy
et al. 2003; Younger et al. 2006), and should be possible
soon with forthcoming data on cluster profiles from
the Sunyaev-Zel'dovich Array (SZA) survey (Muchovej
et al. 2007; Mroczkowski et al. 2007). We postpone such
comparisons to future work.
The X-ray luminosity $L$ of a cluster is calculated as,

$$L = \int dV \int d\nu\; n_e(r)\,n_H(r)\,\Lambda(T(r), \nu), \qquad (9)$$

where $n_e = (1 - Y_{\rm He} + Y_{\rm He}/2)\,\rho_g/m_p$ is the number density
of electrons, $n_H = (1 - Y_{\rm He})\,\rho_g/m_p$ is the number density
of protons, and $\Lambda$ is the cooling function, calculated by
a Raymond-Smith (Raymond & Smith 1977) code with
metallicity $Z = 0.3 Z_\odot$. The integral is done over the
cluster volume $V$ and over frequency $\nu$. The emission-weighted
temperature is calculated as,

$$T_{\rm ew} = \frac{\int dV \int d\nu\; \rho_g^2(r)\,\Lambda(T(r), \nu)\,T(r)}{\int dV \int d\nu\; \rho_g^2(r)\,\Lambda(T(r), \nu)}. \qquad (10)$$

Preheating decreases the central density of
the gas, but increases its temperature. The result is a
lower luminosity and a higher $T_{\rm ew}$; the combined effect
at fixed $T_{\rm ew}$ is a decrease in luminosity.
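The two integrals in equations (9) and (10) can be illustrated with a shell-by-shell sum over a spherically symmetric profile. The sketch below is not the Raymond-Smith calculation used in the text: it substitutes a toy, frequency-integrated cooling function $\Lambda \propto T^{1/2}$ (pure bremsstrahlung scaling), absorbs the constant factors relating $n_e n_H$ to $\rho_g^2$ into the units, and uses hypothetical function names and made-up profile values.

```python
import math

def shell_volumes(radii):
    """Volumes of the spherical shells between consecutive radii."""
    return [4.0 / 3.0 * math.pi * (r2 ** 3 - r1 ** 3)
            for r1, r2 in zip(radii[:-1], radii[1:])]

def toy_cooling(temp):
    """Toy bolometric cooling function, Lambda proportional to sqrt(T)."""
    return math.sqrt(temp)

def luminosity_and_tew(radii, density, temperature, cooling=toy_cooling):
    """Return (L, T_ew); density and temperature hold one value per shell."""
    weights = [vol * rho ** 2 * cooling(t)
               for vol, rho, t in zip(shell_volumes(radii), density, temperature)]
    lum = sum(weights)
    t_ew = sum(w * t for w, t in zip(weights, temperature)) / lum
    return lum, t_ew

# Two-shell toy cluster: a dense core and a more tenuous outer shell.
radii = [0.0, 0.5, 1.0]
lum_cuspy, tew_cuspy = luminosity_and_tew(radii, [2.0, 1.0], [3.0, 3.0])
lum_flat, tew_flat = luminosity_and_tew(radii, [1.0, 1.0], [3.0, 3.0])
```

Because the emission scales as $\rho_g^2$, lowering the central density (the `lum_flat` case) reduces the luminosity even though the temperature is unchanged, which is the qualitative effect of preheating described above.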

We compare our predictions to two flux-limited samples
of X-ray clusters. One is the low-redshift Highest
X-ray FLUx Galaxy Cluster Sample (HIFLUGCS)
presented in Reiprich & Böhringer (2002), including 63
clusters whose mean redshift is $\langle z\rangle = 0.05$. The other is
the high-redshift sample from the Wide Angle ROSAT
Pointed Survey (WARPS) used in Maughan et al. (2006),
including 11 clusters with a mean redshift of $\langle z\rangle = 0.8$.
For each individual cluster, we predict its observed temperature
as the one weighted by the bolometric emission,$^2$
and compare the bolometric luminosity, calculated at
this temperature using the preheating model, with the
observed value. To quantify the goodness of fit of this
comparison, we define the usual $\chi^2$ statistic,

$$\chi^2 = \sum_{i=1}^{N} \frac{[\log L(T_i, z_i, K_0) - \log L_i]^2}{\left(\frac{d\log L}{d\log T}\Big|_{T_i}\right)^2 \sigma^2_{\log T_i} + \sigma^2_{\log L_i}}. \qquad (11)$$

Here $\sigma_{\log T_i}$ and $\sigma_{\log L_i}$ denote the observational measurement
errors of (the base 10 logarithm of) temperature
and luminosity, i.e. $\sigma_{\log T_i} = (\log e)\,\sigma_{T_i}/T_i$, and
$\sigma_{\log L_i}$ is defined analogously (additional, intrinsic scatter
in these quantities will be discussed below). We take
$z_i$, $T_i$, $L_i$, $\sigma_{T_i}$, $\sigma_{L_i}$ directly from the published observational
data, and $L$ and $d\log L/d\log T$ are calculated from the preheating
model. We fix the parameters of the background
cosmology, adopting the flat $\Lambda$CDM model with the best-fit
values from the WMAP 3-year results (Spergel et al.
2007), i.e. $(h, \Omega_m h^2, \Omega_b h^2, \sigma_8, n_s) = (0.73, 0.13, 0.022, 0.76, 0.96)$.

$^2$ We find the difference of this temperature from that weighted
by the band emission, which is actually observed, is less than 4%.
We neglect this difference. We also checked the bias of the emission-weighted
temperature when comparing to the observed spectroscopic
temperatures (see § 6.3 below).
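A minimal sketch of the fit statistic of equation (11) is given below. The model enters only through two callables, one returning $\log_{10} L$ and one returning the local slope $d\log L/d\log T$ at a given temperature (for a fixed redshift and entropy floor); both callables, and the toy $L = T^2$ model in the usage lines, are placeholders introduced for illustration rather than the preheating model itself.

```python
import math

def chi_square(model_log_l, model_slope, temps, lums, sig_t, sig_l):
    """Chi-square of equation (11), with temperature errors propagated through the model slope."""
    log_e = math.log10(math.e)
    total = 0.0
    for t, lum, st, sl in zip(temps, lums, sig_t, sig_l):
        sig_log_t = log_e * st / t          # error on log10(T)
        sig_log_l = log_e * sl / lum        # error on log10(L)
        resid = model_log_l(t) - math.log10(lum)
        var = (model_slope(t) * sig_log_t) ** 2 + sig_log_l ** 2
        total += resid ** 2 / var
    return total

# Toy model L = T^2 (the self-similar slope), used only to exercise the function.
toy_log_l = lambda t: 2.0 * math.log10(t)
toy_slope = lambda t: 2.0

on_model = chi_square(toy_log_l, toy_slope, [2.0, 4.0], [4.0, 16.0], [0.2, 0.4], [0.4, 1.6])
```

Scanning such a function over a grid of $K_0$ values, with the model callables rebuilt for each value, would give the best-fit entropy floor and its confidence interval.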

Therefore, in this comparison, $K_0$ is the only free parameter
to be determined by the fit (allowing variations
in the cosmological parameters will be discussed below).
We quote the best-fit entropy floor value by multiplying
$K$, as defined in equation (1) above, by a constant factor
of $(\mu m_p)^{\gamma}(n/n_e)^{\gamma-1}$, with $n = \rho_g/(\mu m_p)$. This is equivalent
to redefining $K$ as

$$K = \frac{T}{n_e^{\gamma-1}}, \qquad (12)$$

which is the definition widely used in the observational
literature (e.g. Ponman, Cannon & Navarro 1999; Ponman,
Sanderson, & Finoguenov 2003; Pratt & Arnaud
2005). For $\gamma = 5/3$, the commonly used unit for the entropy
defined above is keV cm$^2$, and 1 keV cm$^2$ corresponds
to injecting $0.036\,(1+\delta_b)^{2/3}(1+z)^2(\Omega_b h^2/0.022)^{2/3}$ eV per particle
into the fully-ionized plasma with overdensity $\delta_b$ and
redshift $z$.
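The quoted conversion can be checked with a few lines of arithmetic. The sketch below evaluates $T = K\,n_e^{2/3}$ for $K = 1$ keV cm$^2$ at the cosmic mean baryon density, reading the quoted energy per particle as this characteristic temperature. The physical constants (critical density per $h^2$ and the proton mass, in cgs units) are standard values inserted here, and the function names are hypothetical.

```python
RHO_CRIT_H2 = 1.878e-29   # critical density divided by h^2, in g cm^-3
M_P = 1.6726e-24          # proton mass in g

def electron_density(omega_b_h2, delta_b=0.0, z=0.0, y_he=0.25):
    """Electron number density (cm^-3) of fully ionized H-He gas at overdensity delta_b."""
    rho_b = omega_b_h2 * RHO_CRIT_H2 * (1.0 + delta_b) * (1.0 + z) ** 3
    return (1.0 - 0.5 * y_he) * rho_b / M_P

def temperature_ev(k_kev_cm2, n_e):
    """Temperature in eV for entropy K = T / n_e^(2/3) given in keV cm^2."""
    return 1.0e3 * k_kev_cm2 * n_e ** (2.0 / 3.0)

t_mean_today = temperature_ev(1.0, electron_density(0.022))
t_mean_z1 = temperature_ev(1.0, electron_density(0.022, z=1.0))
```

The first value reproduces the 0.036 eV normalization, and the ratio of the second to the first is $(1+z)^2 = 4$, as expected from $n_e^{2/3} \propto (1+z)^2$.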

Note that the luminosity inferred from observations is
cosmology-dependent, and since Reiprich & Böhringer
(2002) and Maughan et al. (2006) adopt different values
for the cosmological parameters, we re-scale their quoted
luminosities (by the ratio of the squared luminosity
distances) to our fiducial cosmology.

The above procedure, applied to the low-redshift HIFLUGCS
clusters, yields the best-fit entropy floor of
$K_0 = 295\,h^{-1/3}$ keV cm$^2$. We find a total $\chi^2 = 2293$
for this best-fit model, or a $\chi^2$ per degree of freedom
(d.o.f.) of 37, indicating that the $L-T$ relation has additional
intrinsic scatter (caused, possibly, by a cluster-to-cluster
variation in the entropy floor itself; see discussion
of scatter in § 5 below). For the high-redshift WARPS
sample, we find the best-fit $K_0$ to be $172\,h^{-1/3}$ keV
cm$^2$. This fit has a total $\chi^2 = 7$, or a $\chi^2$ per d.o.f. of 0.7.

The $L-T$ scaling relation predicted with the best-fit
entropy floor at the average redshift of the HIFLUGCS
clusters, $z = 0.05$, is shown as the solid curve in Figure 1,
together with the re-scaled low-$z$ data from Reiprich &
Böhringer (2002). For reference, the figure shows the
predicted $L-T$ relations without an entropy floor (dot-dashed
curve) and with the lower $K_0$ inferred from the
high-$z$ sample (dashed curve). The comparison of the
data with the $K_0 = 0$ curve clearly shows the need for
the entropy floor, and the comparison with the
$K_0 = 172\,h^{-1/3}$ keV cm$^2$ curve shows that the observational
data, especially of the 1-3 keV clusters, require that the
entropy floor at $z = 0.05$ is higher than the best-fit value
at $z = 0.8$.

The solid curve in Figure 2 shows the model prediction
for the $L-T$ scaling relation with the best-fit entropy
floor at the average redshift of the WARPS clusters,
$z = 0.8$, together with the re-scaled data from Maughan
et al. (2006). For reference, the figure again shows the
predicted $L-T$ relations without an entropy floor (dot-dashed
curve) and with the higher $K_0$ inferred from the
low-$z$ sample (dashed curve). The comparison of the
data with the $K_0 = 0$ curve clearly shows the need for
the entropy floor at high $z$ as well, and the comparison
with the $K_0 = 295\,h^{-1/3}$ keV cm$^2$ curve shows that the
high-$z$ clusters favor an entropy floor smaller than the
best-fit value at low $z$.
Fig. 1. The $L-T$ scaling relation predicted by the preheating
model with the best-fit entropy floor $K_0 = 295\,h^{-1/3}$ keV
cm$^2$ at the average redshift $z = 0.05$ of the HIFLUGCS clusters
(solid curve), together with data in this sample from Reiprich
& Böhringer (2002), re-scaled to the WMAP 3-year cosmology
adopted in our work. For reference, we show the $L-T$ relation at
$z = 0.05$ predicted without an entropy floor ($K_0 = 0$; dot-dashed
curve) and with the lower entropy inferred from the high-$z$ sample
($K_0 = 172\,h^{-1/3}$ keV cm$^2$; dashed curve; see Figure 2). Measurement
errors on $L$ are smaller than the size of the symbols.




Fig. 2. The $L-T$ scaling relation predicted by the preheating
model with the best-fit entropy floor $K_0 = 172\,h^{-1/3}$ keV cm$^2$ at
the average redshift $z = 0.8$ of the high-redshift WARPS clusters
(solid curve), together with the re-scaled data from Maughan et
al. (2006). For reference, we show the $L-T$ relation at $z = 0.80$
predicted without an entropy floor ($K_0 = 0$; dot-dashed curve)
and with the higher entropy inferred from the low-$z$ sample
($K_0 = 295\,h^{-1/3}$ keV cm$^2$; dashed curve; see Figure 1).

A visual inspection of Figures 1 and 2 ("chi by eye")
indicates that the preheating model of a universal entropy
floor, produced by energy input at an early epoch,
cannot fit the scaling relations of the low-redshift and
high-redshift clusters simultaneously. (We discuss the
significance of the detected evolution quantitatively below,
in § 5.1 and in § 6.1.) It would be natural, in fact, for
the entropy floor to increase with cosmic time, if the energy
input is being continuously provided by stars and/or
AGN. Parameterizing the entropy evolution as a power law
in redshift,

$$K_0(z) = K_0(z=0)\,(1+z)^{-\alpha}, \qquad (13)$$

we can convert the two best-fit values of $K_0$ for the two
cluster samples at $z = 0.05$ and 0.8 to estimate
$K_0(z=0) = 310\,h^{-1/3}$ keV cm$^2$ and $\alpha = 1$. For reference, this
power law is shown in Figure 4.
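The conversion from the two best-fit entropy floors to the power-law parameters is a two-point solve, sketched below. The function name is hypothetical; the inputs are the best-fit values quoted above (in units of $h^{-1/3}$ keV cm$^2$).

```python
import math

def power_law_from_two_points(z1, k1, z2, k2):
    """Solve K(z) = K(0) (1+z)^(-alpha) through the points (z1, k1) and (z2, k2)."""
    alpha = math.log(k1 / k2) / math.log((1.0 + z2) / (1.0 + z1))
    k_z0 = k1 * (1.0 + z1) ** alpha
    return k_z0, alpha

k_z0, alpha = power_law_from_two_points(0.05, 295.0, 0.8, 172.0)
```

With these inputs the slope is very close to 1 and the normalization is close to 310, matching the estimates in the text.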

4. NUMBER COUNTS OF X-RAY CLUSTERS 

The preheating model described above, with the 
power-law approximation for the evolution of the entropy 
floor, can successfully match the observed L — T scaling 
relations. This model also predicts a deterministic rela- 
tion between cluster mass M and both the temperature 
and luminosity. The mass function of dark matter halos 
is well understood from both analytic models (Press & 
Schechter 1974; Bond et al. 1991; Sheth & Tormen 1999)
and numerical simulations (Sheth & Tormen 1999; Jenkins
et al. 2001). It is therefore natural to compare model
predictions to observed cluster counts as a function of
either temperature $T$ or luminosity $L$ (or equivalently,
flux $f$). Here we chose to compare the model predictions
to the $\log N - \log f$ relation derived from the 158 deg$^2$
ROSAT PSPC survey by Vikhlinin et al. (1998). This
sample is ideal for our purposes, since it is both large
and deep enough to provide a good measurement of the
counts to faint fluxes, where the effects of preheating are
more pronounced.

We first use the best-fit cosmological model from the
WMAP 3-year results, and calculate $N(>f)$, the expected
surface number density of clusters whose X-ray fluxes
exceed $f$. The counts are calculated as

$$N(>f) = \int dz\,\frac{d^2V}{dz\,d\Omega}(z) \int_{M_{180}(f,z)}^{\infty} dM\,\frac{dn}{dM}(M, z), \qquad (14)$$

where $d^2V/dz\,d\Omega$ is the comoving volume element, and
$dn/dM$ is the cluster mass function. In this paper, we use
the fitting formula given by Jenkins et al. (2001) for the
SO(180) group finder of dark matter halos. The mass
limit $M_{180}(f,z)$ is determined by first finding the virial
mass $M_{\rm vir}(f,z)$ of the cluster at redshift $z$ that gives
a flux $f$; then converting it to $M_{180}(f,z)$ by extending
the NFW profile of this cluster until the enclosed matter
has a mean density of 180 times the background matter
density at that time.
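The last step, converting $M_{\rm vir}$ to $M_{180}$ by extending the NFW profile, amounts to a one-dimensional root find. The sketch below works in units of $r_{\rm vir}$ and $M_{\rm vir}$ and uses bisection in $\log r$; the function names are hypothetical, and the numbers in the usage lines ($\Omega_m \approx 0.244$, $\Delta_v \approx 97$ at $z = 0$) are illustrative values consistent with the adopted cosmology rather than quantities quoted in the text.

```python
import math

def nfw_m(x):
    """Dimensionless NFW enclosed-mass function m(x) = ln(1+x) - x/(1+x)."""
    return math.log(1.0 + x) - x / (1.0 + x)

def overdensity_radius_and_mass(target_ratio, conc=5.0):
    """
    Find r (in r_vir) where the mean enclosed NFW density equals target_ratio
    times the mean density inside r_vir; return (r, M(<r)/M_vir).
    """
    def mean_density(r):
        return nfw_m(conc * r) / (nfw_m(conc) * r ** 3)
    lo, hi = 0.01, 100.0               # mean_density falls monotonically with r
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if mean_density(mid) > target_ratio:
            lo = mid
        else:
            hi = mid
    r = math.sqrt(lo * hi)
    return r, nfw_m(conc * r) / nfw_m(conc)

def m180_over_mvir(omega_m_z, delta_v, conc=5.0):
    """M_180 / M_vir, where 180 refers to the background matter density."""
    target = 180.0 * omega_m_z / delta_v   # 180 rho_m / (Delta_v rho_c)
    return overdensity_radius_and_mass(target, conc)[1]

ratio = m180_over_mvir(omega_m_z=0.244, delta_v=97.0)
```

Since $180\,\Omega_m(z) < \Delta_v$ at low redshift, the radius enclosing 180 times the background density lies outside $r_{\rm vir}$ and `ratio` exceeds unity.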

The results are shown as the dashed curve in Figure 3,
together with the observational data from Vikhlinin et
al. (1998). The figure shows that the WMAP 3-year
cosmology, together with the preheating model that fits
the $L-T$ scaling relations, underpredicts the cumulative
number counts of X-ray clusters, especially at the
low flux limits. Considering the sensitivity of the cluster
number counts to $\sigma_8$, it is natural to ask whether the
discrepancy can be resolved by increasing the value of
$\sigma_8$ and leaving all other parameters unchanged (clearly,
variations in $\sigma_8$ will not modify the best-fit $K_0$ inferred
from the scaling relations). We therefore vary $\sigma_8$, and
apply a $\chi^2$ statistic to the 158 deg$^2$ ROSAT PSPC data
to find its best-fit value. We use
Fig. 3. Cumulative number counts of galaxy clusters per deg$^2$,
$N(>f)$, as a function of the X-ray flux $f$ (in erg s$^{-1}$ cm$^{-2}$) in the 0.5-2 keV soft X-ray
band. Filled squares show data from Vikhlinin et al. (1998). The
dashed curve shows predictions using the WMAP 3-year cosmology
(in particular, $\sigma_8 = 0.76$), and the $L-M$ relation calculated from
the preheating model without consideration of any intrinsic scatter
($\sigma_{\ln L|T} = \sigma_{\ln L|M} = 0$). The dotted curve corresponds to the case
with intrinsic scatters of $\sigma_{\ln L|T} = 0.3$ and $\sigma_{\ln L|M} = 0.59$. The
solid curve is similar to the dotted curve, except it is calculated
with a higher $\sigma_8 = 0.8$, which gives the best agreement with the
data when scatter is included.

where $i$ labels the independent flux bins and $A$ is the survey area.
We include a simple Poisson error (uniform sky coverage
at all flux limits) in the calculation of the variance in
addition to the measurement error. We find the best-fit
value of $\sigma_8 = 0.82 \pm 0.02$, which is larger than the WMAP
3-year best-fit value $\sigma_8 = 0.76 \pm 0.05$ (in the presence of
scatter, our best-fit value is reduced to $\sigma_8 = 0.80 \pm 0.02$; see
below).

5. THE EFFECTS OF INTRINSIC SCATTER 

In the above two sections, we assumed that clusters at 
redshift z with fixed virial mass Mvir have temperatures 
and luminosity exactly as predicted by the preheating 
model. In reality, deviations from spherical symmetry, 
as well as cluster-to-cluster variations in non-adiabatic
processes, will lead to non-negligible scatter in these two 
quantities. For flux-limited surveys, such scatter will 
cause the observed scaling relations to deviate from the 
true ones (Stanek et al. 2006; Nord et al. 2007), and the 
counts to deviate from those of equivalent mass-limited 
samples without scatter. To make our analysis more re- 
alistic, it is necessary to take these effects into account. 
In this section, we repeat the calculations in the above 
two sections, but we include intrinsic scatter, which we 
model separately in the L — T and L — M relations. 

5.1. Scatter in the L — T Relation 

At a given redshift $z$, the joint probability distribution
for $L$ and $T$ of a cluster with fixed $M_{\rm vir}$ may be conveniently
modeled as a bivariate log-normal distribution
$P(L, T|M_{\rm vir})$, with the logarithmic means determined by
$M_{\rm vir}$ (Nord et al. 2007). Convolved with the cluster mass
function, this can be used to predict the probability distribution
of luminosity for clusters at fixed temperature
$T$. For a flux-limited sample, the average and variance
of $L$ for these clusters can also be predicted. Here, since
we care only about the final $L-T$ scaling relation, for
simplicity, we assume that $P(L|T)$, the probability distribution
function of $L$ for clusters at fixed $T$, is log-normal,

$$P(L|T)\,dL = \frac{1}{\sqrt{2\pi}\,\sigma_{\ln L|T}} \exp\left[-\frac{(\ln L - \overline{\ln L})^2}{2\sigma^2_{\ln L|T}}\right] d\ln L. \qquad (16)$$

Given that the log-normal shape of $P(L,T|M)$ is not
particularly well justified to begin with, and that our
results are essentially more sensitive to the width of the
$P(L|T)$ distribution than to its detailed shape, we regard
this as a sensible approach. The logarithmic mean $\overline{\ln L}$
in equation (16) is taken to be the logarithm of the
luminosity predicted by the preheating model for a cluster
that has temperature $T$ according to the same model.
The scatter $\sigma_{\ln L|T}$ is taken to be a constant. Here we
choose it to be 0.3, which is close to the value $\approx 0.4$
expected for current flux-limited [$f(0.1-2.4\,{\rm keV}) \sim 3 \times
10^{-12}$ erg/s/cm$^2$] samples. In particular, Nord et al.
(2007) derive this value by assuming a bivariate
log-normal distribution $P(L,T|M)$, with intrinsic
scatters $\sigma_{\ln L|M} = 0.59$ and $\sigma_{\ln T|M} = 0.1$, power-law relations
between the means, and a positive correlation between
$\ln L$ and $\ln T$.

For a flux-limited survey with a threshold $f_{\rm min}$ in the
observer rest-frame energy band $[\nu_1, \nu_2]$, the log mean
luminosity of detectable clusters at fixed temperature $T$
is given by

$$\langle \ln L \rangle(T) = \overline{\ln L} + \sigma_{\ln L|T} \sqrt{\frac{2}{\pi}}\, \frac{\exp(-x_{\rm min}^2)}{{\rm erfc}(x_{\rm min})}, \qquad (17)$$

where erfc is the complementary error function,
$x_{\rm min} = (\ln L_{\rm min} - \overline{\ln L})/(\sqrt{2}\,\sigma_{\ln L|T})$,
and $L_{\rm min}$ is the luminosity corresponding to
the flux threshold, $L_{\rm min} = 4\pi d_L^2(z) f_{\rm min} / K(T, z)$. Here
$d_L(z)$ is the luminosity distance, and $K$ is the ratio of the
X-ray emission in the energy band $[\nu_1(1 + z), \nu_2(1 + z)]$
(cluster rest frame) to the bolometric luminosity; it is
calculated by the preheating model for the same cluster
when we calculate $\overline{\ln L}$. Clusters with luminosity
below $L_{\rm min}$ are not included in the average, so $\langle \ln L \rangle$ is
larger than that for the complete sample (the so-called
Malmquist bias). The variance of the log of the
luminosity for the flux-limited sample can be calculated
similarly:



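$$\langle (\ln L - \langle \ln L \rangle)^2 \rangle(T) = \sigma_{\ln L|T}^2 \times \left[ 1 + \frac{2 x_{\rm min}}{\sqrt{\pi}} \frac{\exp(-x_{\rm min}^2)}{{\rm erfc}(x_{\rm min})} - \frac{2}{\pi} \frac{\exp(-2x_{\rm min}^2)}{{\rm erfc}^2(x_{\rm min})} \right]. \qquad (18)$$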



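Note that equations (17) and (18) have manifestly
correct limiting behaviors: in the limit $L_{\rm min} \rightarrow 0$,
$\langle \ln L \rangle \rightarrow \overline{\ln L}$ and
$\langle (\ln L - \langle \ln L \rangle)^2 \rangle \rightarrow \sigma_{\ln L|T}^2$,
whereas in the limit $L_{\rm min} \rightarrow \infty$, we have
$\langle \ln L \rangle \rightarrow \ln L_{\rm min}$ and
$\langle (\ln L - \langle \ln L \rangle)^2 \rangle \rightarrow 0$.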
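As an illustration of equations (17) and (18), the following minimal Python sketch evaluates the mean and variance of $\ln L$ for a log-normal distribution truncated at $L_{\rm min}$, and can be used to check the two limits numerically. The function name `truncated_lognormal_moments` and the example numbers are hypothetical and chosen only for illustration; they are not part of the analysis.

```python
import math


def truncated_lognormal_moments(ln_l_mean, sigma, ln_l_min):
    """Mean and variance of ln L restricted to ln L > ln L_min.

    ln_l_mean : log mean of the untruncated distribution
    sigma     : intrinsic scatter sigma_lnL|T
    ln_l_min  : log of the luminosity threshold
    """
    x_min = (ln_l_min - ln_l_mean) / (math.sqrt(2.0) * sigma)
    # exp(-x^2)/erfc(x) appears in both moments; it underflows for very
    # large x_min, so this sketch is only meant for moderate thresholds.
    ratio = math.exp(-x_min ** 2) / math.erfc(x_min)
    mean = ln_l_mean + sigma * math.sqrt(2.0 / math.pi) * ratio
    variance = sigma ** 2 * (
        1.0
        + 2.0 * x_min / math.sqrt(math.pi) * ratio
        - 2.0 / math.pi * ratio ** 2
    )
    return mean, variance


# Illustrative values only: scatter 0.3 and three thresholds
# (far below, at, and far above the log mean).
for threshold in (-6.0, 0.0, 1.5):
    print(threshold, truncated_lognormal_moments(0.0, 0.3, threshold))
```

With a threshold far below the log mean, the moments reduce to the untruncated values; with a threshold far above it, the mean approaches the threshold and the variance shrinks, which is the Malmquist-bias behavior described in the text.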

To take into account the above, we modify the
calculation of the $\chi^2$ statistic for the two flux-limited
cluster samples. Specifically, in equation (11), we
replace the average $\log L(T_i, z_i, K_0)$ by $\langle \ln L \rangle(T_i, z_i, K_0) \times
(\log e)$, and the variance term by
$\langle (\ln L - \langle \ln L \rangle)^2 \rangle(T_i, z_i) \times (\log e)^2$. Measurement errors in the
temperature may further modify the average and variance of
the $i^{\rm th}$ cluster's luminosity; here, we include this effect
approximately by simply adding a term
$\left(\frac{\partial \log L}{\partial \log T}\,\sigma_{\log T_i}\right)^2$
to the intrinsic variance.

With these alterations of the $\chi^2$ statistic, we find
that the best-fit entropy floor $K_0$ for the low-redshift
HIFLUGCS clusters is increased to $327 \pm 19\,h^{-1/3}$ keV
cm$^2$ (with a $\chi^2$ per d.o.f. of 2.2), and the best-fit $K_0$
for the high-redshift WARPS clusters is increased to
$209\,h^{-1/3}$ keV cm$^2$ (with a $\chi^2$ per d.o.f. of 0.5).$^7$
Since Malmquist bias shifts the average luminosity to
a larger value, more entropy is needed to bring the
model prediction into agreement with the observations (see § 6.6
for more discussion of this), but the increase is only
$\approx 10-20\%$. More importantly, however, we see that the
significance of the difference in $K_0$ between the high- and
low-redshift samples is reduced, but remains at the
interesting level of $(327 - 209)/\sqrt{19^2 + 66^2} \approx 1.7\sigma$ (see
§ 6.1 for more about this). We find that the power-law
approximation to the evolution of the entropy floor changes to
$K_0(z) = 341(1 + z)^{-0.83} h^{-1/3}$ keV cm$^2$.
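The arithmetic behind the quoted significance and power-law evolution can be checked with a short Python sketch. It assumes that the power law is simply the curve passing through the two best-fit values at the mean sample redshifts 0.05 and 0.80 (the actual fitting procedure may differ), and the helper names are hypothetical.

```python
import math


def significance(k_low_z, err_low_z, k_high_z, err_high_z):
    """Difference of two best-fit values in units of the combined error."""
    return (k_low_z - k_high_z) / math.hypot(err_low_z, err_high_z)


def power_law_through(z1, k1, z2, k2):
    """Return (K0 at z=0, slope) for K0(z) = K0(0) * (1 + z)**slope."""
    slope = math.log(k1 / k2) / math.log((1.0 + z1) / (1.0 + z2))
    k_at_zero = k1 / (1.0 + z1) ** slope
    return k_at_zero, slope


# Values quoted in the text for the fits that include intrinsic scatter.
n_sigma = significance(327.0, 19.0, 209.0, 66.0)
k_zero, slope = power_law_through(0.05, 327.0, 0.80, 209.0)
print(round(n_sigma, 2), round(k_zero), round(slope, 2))
```

Under this two-point assumption the sketch recovers a difference of about 1.7 combined standard deviations, and a normalization and slope close to the quoted 341 and -0.83.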

5.2. Scatter in the L — M Relation 

In this section, as before, we assume that the bolo- 
metric luminosity L for clusters with virial mass Mvir at 
redshift z has a log-normal probability distribution, 

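$$P(L|M_{\rm vir}, z)\,dL = \frac{1}{\sqrt{2\pi}\,\sigma_{\ln L|M}} \exp\left[-\frac{(\ln L - \overline{\ln L})^2}{2\sigma_{\ln L|M}^2}\right] d\ln L. \qquad (19)$$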

The log mean $\overline{\ln L}$ is calculated as the logarithm of the
luminosity predicted for the cluster by the preheating
model with the evolving entropy floor found in § 5.1.
The scatter is taken to be a constant; we adopt the
value $\sigma_{\ln L|M} = 0.59$ derived by Stanek et al. (2006) from
matching the predicted cluster counts to the REFLEX
survey results (Böhringer et al. 2004).

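The fraction of clusters with flux above $f$, or luminosity
above $L_{\rm min}$, is then simply

$$P(>f | M_{\rm vir}) = \frac{1}{2}\,{\rm erfc}(x_{\rm min}), \qquad (20)$$

where $L_{\rm min}$ and $x_{\rm min}$ are calculated as in § 5.1. Finally,
the number counts are given by

$$N(>f) = \int dz\, \frac{dV}{dz} \int \frac{dn}{dM}(M, z)\, P(>f | M_{\rm vir}, z)\, dM. \qquad (21)$$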
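The detection fraction that enters the number counts can be sketched in a few lines of Python. This is only an illustration of equation (20) and of the conversion from a flux threshold to $L_{\rm min}$; the function names and the example cluster (luminosity, distance, band fraction, flux threshold) are hypothetical and not taken from the model.

```python
import math


def l_min(f_min, d_l_cm, band_fraction):
    """Bolometric luminosity threshold, L_min = 4 pi d_L^2 f_min / K."""
    return 4.0 * math.pi * d_l_cm ** 2 * f_min / band_fraction


def detection_fraction(ln_l_mean, sigma_ln_l, ln_l_min):
    """Equation (20): fraction of clusters with L above L_min."""
    x_min = (ln_l_min - ln_l_mean) / (math.sqrt(2.0) * sigma_ln_l)
    return 0.5 * math.erfc(x_min)


# Hypothetical cluster: log-mean bolometric luminosity 2e43 erg/s at a
# luminosity distance of 1.4e27 cm, band fraction K = 0.5, and a flux
# threshold of 5e-13 erg/s/cm^2 (all numbers illustrative).
threshold = l_min(5.0e-13, 1.4e27, 0.5)
fraction = detection_fraction(math.log(2.0e43), 0.59, math.log(threshold))
print(threshold, fraction)
```

When the log-mean luminosity coincides with the threshold, the function returns exactly one half, which is the sense in which a mass can be assigned a 50% detection probability at a given flux limit.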

Note that $M$, the mass of the cluster employed in the mass
function, is defined by an overdensity of 180 relative to the
background matter density, and differs from $M_{\rm vir}$. As before,
the NFW profile is used to convert $M_{\rm vir}$ to $M$. The
counts predicted in this model with scatter are shown
as the dotted curve in Figure 3. The difference from
the original calculation, assuming no intrinsic scatter
(dashed curve), is relatively small. Although a non-zero
$\sigma_{\ln L|M}$, by itself, tends to significantly increase the
number counts, we are also allowing the log mean luminosity
$\overline{\ln L}$ (at fixed $M$) to change. As explained in the
preceding subsection, a non-zero $\sigma_{\ln L|T}$ necessitates more
entropy in order to match the $L-T$ scaling relations,
and tends to reduce $\overline{\ln L}$ (at fixed $T$, and also at fixed
$M$), and hence to decrease the number counts. The
combination of these two effects is that $N(>f)$ increases, but
only by a relatively small factor ($\approx 20\%$).

$^7$ Two clusters in the high-redshift WARPS sample are removed
here because they fall below the nominal flux threshold given in
Maughan et al. (2006).

By repeating the analysis done at the end of § 4,
we find that when all other cosmological parameters are
kept fixed at the best-fit values from the WMAP 3-year
results, the preheating model that agrees with the $L-T$
scaling relations at both low and high redshift reduces the
best-fit value of $\sigma_8$ by a small amount, from $0.82 \pm 0.02$ to
$0.80 \pm 0.02$ (see Figure 3). The latter value still exceeds
the best-fit value from the WMAP 3-year data, but
becomes marginally consistent with their $1\sigma$ error. We also
note that our best-fit $\sigma_8 = 0.80$ agrees well with the
value found by Lesgourgues et al. (2007) from a
combined analysis of the Lyman-$\alpha$ forest, 3D weak lensing and
the WMAP 3-year data.

6. DISCUSSION 

In this section, we discuss, quantitatively, a range of 
issues that should help understand our results and assess 
their robustness. 

6.1. Significance of the Inferred Entropy Evolution 

Perhaps our most interesting result is the increase in
the entropy floor from the $z \sim 0.8$ to the $z \sim 0.05$
cluster sample, and therefore here we discuss the statistical
significance of this difference. In our analysis above, we
have assumed a constant (non-evolving) intrinsic scatter
$\sigma_{\ln L|T}$, adopted from the work of Nord et al. (2007),
resulting in a $\approx 1.7\sigma$ detection of the difference in the
entropy floor values at $z = 0.8$ and $z = 0.05$ (see § 5.1
above). In reality, the measurement errors of the low-$z$
cluster sample are much smaller than those of the
high-$z$ sample, and the intrinsic scatter can, in fact, be
inferred self-consistently from the $L-T$ relation we fit.
Here we repeat the analysis in § 5.1, but we allow the
scatter $\sigma_{\ln L|T}$ to vary, and attempt to adjust its value
to find a $\chi^2$ per degree of freedom of unity, for both
cluster samples. We find that the low-$z$ sample then
requires a scatter of $\sigma_{\ln L|T} = 0.49$, which is larger than
our adopted value. Using this larger scatter shifts the
best-fit entropy level to $372\,h^{-1/3}$ keV cm$^2$. For the
high-$z$ sample, we find that the measurement errors are
so large that the best-fit model has a $\chi^2$ per d.o.f. that is
less than 1 ($\approx 0.7$) even in the absence of any intrinsic
scatter. We conclude that the current data cannot yet
be used to establish evidence for any intrinsic scatter in
the high-$z$ sample. The best-motivated statistical
comparison, then, is between the best-fit $K_0$ we obtain for
the low-$z$ sample with $\sigma_{\ln L|T} = 0.49$, and the best-fit
value for the high-$z$ sample obtained with $\sigma_{\ln L|T} = 0$
($172 \pm 35\,h^{-1/3}$ keV cm$^2$, see § 3). This implies a
significance of the difference between the best-fit values of
$(372 - 172)/\sqrt{37^2 + 35^2} \approx 4\sigma$ (with the best-fit
power-law evolution changing to $K_0(z) = 398(1 + z)^{-1.43} h^{-1/3}$
keV cm$^2$). Clearly, better temperature measurements
for the high-$z$ clusters would help determine whether the
intrinsic scatter evolves, which would be important to
validate this result.

The entropy floor has a larger impact on the smallest
clusters, and one may wonder to what extent the inferred
entropy floor is driven by the two low-temperature
clusters in Figure 1 that lie visibly below the best-fit relation.
When we omit these two clusters and repeat our
analysis with the rest of the HIFLUGCS sample, we find that
the best-fit entropy floor decreases by 7%, from 295 to
273 $h^{-1/3}$ keV cm$^2$, when ignoring intrinsic scatter in
the analysis; by 12%, from 327 to 287 $h^{-1/3}$ keV
cm$^2$, when including intrinsic scatter ($\sigma_{\ln L|T} = 0.3$); and
also by 12%, from 372 to 329 $h^{-1/3}$ keV cm$^2$, when
including a larger intrinsic scatter ($\sigma_{\ln L|T} = 0.49$). The
$4\sigma$ significance of the difference claimed above is then reduced
to $3\sigma$.

Finally, we use an alternative statistic to assess the
significance of the difference between the high-$z$ and low-$z$
entropy floors. We derive the entropy floor $K_0$ for each
individual cluster in the two samples by simply setting
$L(T_i, z_i, K_0) = L_i$ (following the notation in § 3). This
results in a range of $K_0$ values, shown by the symbols
in Figure 4, which can be used to construct two separate
$K_0$-distributions, for the high-$z$ and low-$z$ samples.
We then apply a Kolmogorov-Smirnov (KS) test to the
two $K_0$-distributions. We find $D = 0.4$ and a $P$-value of
0.07, which makes it unlikely that the two sets of $K_0$
values were drawn from the same underlying distribution.
Unfortunately, this test remains inconclusive at present,
since, as mentioned above, the observational errors on the
temperature are much larger in the high-$z$ sample than
in the low-$z$ sample, and this difference alone introduces
a difference in the inferred $K_0$ distributions.
Furthermore, the intrinsic scatter may evolve between the two
redshifts for reasons unrelated to the entropy floor.
Indeed, this is suggested by the presence of negative $K_0$
values in the low-$z$ sample, which presumably arise from
unmodeled processes that brighten some clusters' X-ray
emission (e.g. cooling cores). In order to conclude that
the KS test detects a true evolution (either in entropy, or
in some other process modifying the luminosity
distribution at fixed $T$), we would have to explicitly model the
observational errors, which is not yet warranted, given
the large errors in the high-$z$ sample.
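For readers who wish to repeat this kind of comparison, the two-sample KS statistic can be sketched in pure Python as below. This is a minimal illustration, not the code used for the analysis: the $P$-value uses the standard asymptotic Kolmogorov series with a common small-sample correction, and the two lists of $K_0$ values are invented placeholders rather than the values shown in Figure 4.

```python
import math


def ks_two_sample(a, b):
    """Return (D, p) for the two-sample Kolmogorov-Smirnov test.

    D is the maximum distance between the two empirical cumulative
    distributions; p is an asymptotic approximation to the P-value.
    """
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every point <= x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        # The alternating series converges poorly here; p is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


# Placeholder K0 values (keV cm^2), for illustration only.
low_z = [250.0, 280.0, 300.0, 310.0, 330.0, 350.0, 400.0, 420.0]
high_z = [90.0, 150.0, 170.0, 200.0, 260.0, 320.0]
print(ks_two_sample(low_z, high_z))
```

As the text cautions, such a test compares the observed distributions only; it does not separate a true evolution from differences in measurement errors between the two samples.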

6.2. Evolution of the X-ray Scaling Relations 

Our analysis requires evolution in the entropy floor, 
which also predicts a specific evolution in the L — T scal- 
ing relation. In this section, we compare these evolu- 
tions with those derived from observations in previous 
work. A particularly relevant study is by Ettori et al. 
(2004, hereafter E04), which examines the evolution of 
the entropy K inferred from the X-ray scaling relation, 
with $K$ measured at $0.1 R_{200}$. They find that the entropy $K$
at fixed temperature evolves as $(1 + z)^{0.3} / E^{4/3}$,
corresponding, from $z = 0.8$ to $z = 0.05$, to a 50% increase,
which appears, naively, to be in good agreement with our
finding. However, we caution that E04 measure $K$ at
$0.1 R_{200}$, which may include a contribution from
gravitational shock heating, especially in higher-mass clusters,
and therefore will not correspond directly to the
preheating entropy we derive (furthermore, the $\beta$-model density
profile assumed in E04 differs from the one in our model).

Fig. 4. - The entropy floor inferred for individual clusters in the
HIFLUGCS and high-redshift WARPS samples, shown against the
cluster's redshift. The narrower (red) error bars are obtained by
allowing the predicted luminosity for the cluster to vary within the
$1\sigma$ regions allowed by observational errors, while the wider (green)
ones are obtained by additionally including a constant intrinsic
scatter in luminosity at fixed temperature, $\sigma_{\ln L|T} = 0.3$. The
curves are the power-law evolution of the entropy floor obtained
without taking into account the intrinsic scatter (dashed
curve) and with an intrinsic scatter of $\sigma_{\ln L|T} = 0.3$ (solid curve).

Our high-$z$ sample is taken from Maughan et al. (2006),
which also analyzed the evolution of the $L-T$ relation,
and found that this evolution is consistent with the
expectation in self-similar models (with no preheating).
How can this be reconciled with our results? We first
note that the high-$z$ sample in Maughan et al. (2006)
includes only clusters with $T > 3$ keV, and that their
inferred evolution relates to the normalization of the
best-fit power-law relations (whereas our $L-T$ relations are
not power laws). For a clear illustration of how the two
results can be reconciled, we return to our calculations
without intrinsic scatter. In Figure 5, we reproduce the
mean $L-T$ relations from Figure 1 and Figure 2, and
overlay the six model curves in a single figure. Note that
the lowest solid curve and the middle dashed curve are
predicted at our best-fit entropy levels for the low-$z$ and
high-$z$ samples, respectively. Comparing these two curves
with those predicted with $K_0 = 0$, we find that our evolving
entropy floor predicts a self-similar-like evolution for
the $L-T$ scaling relations when $T > 3$ keV, in
agreement with Maughan et al. (2006). This figure also clearly
shows that a constant but non-zero $K_0$ cannot mimic a
self-similar-like evolution.

This can be explained as follows: for clusters
at the same redshift, the same entropy level ($K_0$)
affects the low-temperature clusters more than it does the
high-temperature ones, because the latter have a larger
characteristic (gravitationally heated) entropy. (This, of
course, is well known, and it is the effect that leads to
a larger fractional reduction in the luminosity for the
low-$T$ systems, steepening the $L-T$ scaling relations.)
Similarly, for clusters with the same $T$ but at different
redshifts, a constant entropy level leads to a larger
fractional reduction in the luminosity for the higher-redshift
clusters, because they have a larger characteristic density
and a smaller entropy. As a result, maintaining the
self-similar-like evolution requires less entropy at higher
redshift.

Fig. 5. - $L-T$ scaling relations predicted by the preheating
model at different redshifts and different entropy levels. Solid lines
are at redshift $z = 0.05$, and dashed lines are at redshift $z = 0.8$.
In both sets of lines, from bottom to top the entropy floors are set
at 295, 172, and 0 $h^{-1/3}$ keV cm$^2$. The curves are reproduced from
Figures 1 and 2.

Provided $T \propto M^{2/3} \rho_{\rm vir}(z)^{1/3}$, the characteristic
entropy scales as $K \propto T \rho_{\rm vir}(z)^{-2/3}$; to maintain
self-similar evolution of $L$ at
fixed $T$, we would need $K_0 \propto \rho_{\rm vir}(z)^{-2/3}$. Taking
$K_0 = 295$ at $z = 0.05$, this requires $K_0 = 136$ at $z = 0.8$,
20% smaller than our best-fit value at this redshift. This
indicates that the evolution of our $L-T$ scaling relation
is not exactly self-similar, but a little shallower. Figure
4 in Maughan et al. (2006) is indeed consistent with this
small deviation from self-similarity.
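The self-similar scaling of the entropy floor with the virial density can be evaluated with the short Python sketch below. It rests on two assumptions that are not stated in this section: a flat $\Lambda$CDM cosmology with the WMAP 3-year matter density of 0.24, and the Bryan & Norman (1998) fitting formula for the virial overdensity relative to the critical density. The function names are hypothetical.

```python
import math

OMEGA_M = 0.24  # WMAP 3-year value quoted in the text; flatness assumed


def e_squared(z):
    """Square of the dimensionless Hubble rate for flat LambdaCDM."""
    return OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M)


def rho_vir_relative(z):
    """Virial density in units of the z = 0 critical density.

    Uses the Bryan & Norman (1998) fit for the virial overdensity,
    which is an assumption of this sketch.
    """
    x = OMEGA_M * (1.0 + z) ** 3 / e_squared(z) - 1.0
    delta_c = 18.0 * math.pi ** 2 + 82.0 * x - 39.0 * x ** 2
    return delta_c * e_squared(z)


def self_similar_floor(k_ref, z_ref, z):
    """Entropy floor at z that scales as rho_vir**(-2/3) from (k_ref, z_ref)."""
    ratio = rho_vir_relative(z) / rho_vir_relative(z_ref)
    return k_ref * ratio ** (-2.0 / 3.0)


k_needed = self_similar_floor(295.0, 0.05, 0.8)
print(k_needed)
```

Under these assumptions the result lands within a few percent of the value of 136 quoted above; the small residual would depend on the exact definition of the virial density used in the model.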

6.3. Bias of the Emission-weighted Temperature 

In our analysis above, we have compared the predicted
emission-weighted temperature $T_{\rm ew}$ for a cluster to its
observational counterpart. Since the latter is generally
obtained by fitting a thermal model to the observed
spectrum, the former is in general a biased estimator. In
particular, Mazzotta et al. (2004) have demonstrated that
$T_{\rm ew}$ always overestimates the spectroscopic temperature
if the cluster has a complex multi-temperature thermal
structure. They proposed instead using a so-called
spectroscopic-like temperature $T_{\rm sl}$, which they found to
be within a few percent of the actual spectroscopic
temperature, measured for simulated clusters hotter than
2-3 keV. To quantify how the bias in $T_{\rm ew}$ affects our
results, we adopted the formula for $T_{\rm sl}$ from Mazzotta et
al. (2004) and repeated our calculations. We find $T_{\rm sl}$ is
smaller than $T_{\rm ew}$ by around 10%. The result is that the
best-fit entropy level shifts to a higher value: from 327
to 420 $h^{-1/3}$ keV cm$^2$ for the HIFLUGCS sample, and
from 209 to 287 $h^{-1/3}$ keV cm$^2$ for the WARPS sample,
giving an evolution of $K_0(z) = 436(1 + z)^{-0.71} h^{-1/3}$
keV cm$^2$. (The effects of the intrinsic scatter $\sigma_{\ln L|T}$ and
Malmquist bias are included in these results, as in § 5.1.)
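The difference between the two temperature definitions can be illustrated with a toy multi-phase gas in Python. The sketch assumes a pure bremsstrahlung weighting ($n^2 T^{1/2}$) for the emission-weighted temperature, which may differ from the weighting used in the model, and the Mazzotta et al. (2004) weighting ($n^2 T^{-3/4}$) for the spectroscopic-like temperature. The function names and the two-phase example are hypothetical.

```python
def weighted_temperature(densities, temperatures, volumes, power):
    """Average temperature with weights n^2 * T**power * V per gas element."""
    weights = [
        n * n * t ** power * v
        for n, t, v in zip(densities, temperatures, volumes)
    ]
    return sum(w * t for w, t in zip(weights, temperatures)) / sum(weights)


def t_emission_weighted(densities, temperatures, volumes):
    """Assumes bremsstrahlung emissivity, proportional to n^2 T^(1/2)."""
    return weighted_temperature(densities, temperatures, volumes, 0.5)


def t_spectroscopic_like(densities, temperatures, volumes):
    """Mazzotta et al. (2004) weighting, proportional to n^2 T^(-3/4)."""
    return weighted_temperature(densities, temperatures, volumes, -0.75)


# Toy two-phase gas: equal density and volume, temperatures 3 and 6 keV.
n, t, v = [1.0, 1.0], [3.0, 6.0], [1.0, 1.0]
print(t_emission_weighted(n, t, v), t_spectroscopic_like(n, t, v))
```

Because the spectroscopic-like weights favor cooler gas, the second value is lower than the first whenever the gas is not isothermal, and the two agree exactly for isothermal gas.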

6.4. Parameter Degeneracies 



An obvious issue, even within the context of the simple 
model adopted in our study, is that of parameter degen- 
eracies. A full multi-dimensional degeneracy study is left 
for future work; here we examine only the variations be- 
tween parameters that we expect may have the largest 
effect on our conclusions. 

Overall degeneracy between $\eta$, $K_0$, and $\sigma_8$. In our
fiducial model, we have included 20% non-thermal pressure
support (i.e. $\eta = 0.8$). This choice is motivated by
simulations that reveal turbulent motions in the intra- 
cluster gas (Norman & Bryan 1999; Faltenbacher et al. 
2005; Younger & Bryan 2007). Including turbulent sup- 
port in the analytical model is indeed necessary in order 
to reproduce in detail the density and temperature pro- 
files for the intracluster gas in simulations with preheat- 
ing (Younger & Bryan 2007). There is also direct ob- 
servational evidence for turbulence in the Coma cluster 
(Schuecker et al. 2004). In addition to turbulence, how- 
ever, relativistic particles accelerated by cosmic shocks 
or other mechanisms can provide further pressure sup- 
port for the intracluster gas (Miniati 2005). In order to 
account for the possibility of such an additional pressure 
component, we repeated the analysis of the previous
sections, but changed the value of $\eta$ from 0.8 to $\eta = 0.7$.
This new calculation serves, more generally, to quantify 
the impact of uncertainty in the non-thermal pressure 
component on our result. 

We find that more non-thermal pressure support
decreases both the density and the temperature for a cluster
at fixed mass, and decreases both its $L$ and $T_{\rm ew}$.
However, at a fixed $T_{\rm ew}$, we find that $L$ is slightly increased.
As a result, in order to reproduce the observed $L-T$
scaling relations, more entropy is needed (both at low
and high redshift). We find that the best-fit evolving entropy
floor changes to $K_0(z) = 381(1 + z)^{-0.84} h^{-1/3}$ keV
cm$^2$. After preheating is included, keeping the WMAP
3-year cosmological parameters fixed, the model
underpredicts the number counts even more, as a result of the
decreased luminosity at fixed virial mass. Treating $\sigma_8$ as
a free parameter, we find that the best-fit value increases to
$\sigma_8 = 0.86$. (Intrinsic scatters are included in the analysis
as in § 5.) According to this analysis, non-thermal
pressure support is degenerate with both the entropy floor
and the normalization of the power spectrum: a 50%
increase in non-thermal pressure results in an 8% increase
in $\sigma_8$ and an $\approx 12\%$ increase in $K_0$ (with virtually no
effect on the slope of the entropy evolution).
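The percentages quoted for this degeneracy follow directly from the numbers in the text, as the short Python check below shows. It assumes that the comparison values are the fiducial results with intrinsic scatter (a best-fit normalization of 0.80 and an entropy-floor normalization of 341); the helper name is hypothetical.

```python
def fractional_change(old, new):
    """Relative change from old to new."""
    return (new - old) / old


# Non-thermal pressure fraction goes from 1 - 0.8 to 1 - 0.7.
non_thermal = fractional_change(1.0 - 0.8, 1.0 - 0.7)
sigma8_shift = fractional_change(0.80, 0.86)
k0_shift = fractional_change(341.0, 381.0)
print(non_thermal, sigma8_shift, k0_shift)
```

The three numbers correspond to the 50%, roughly 8%, and roughly 12% changes stated above.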

Degeneracy between $K_0$ and $\sigma_8$ from $dN/df$. We found
above that if the entropy floor $K_0$ is fitted from the
scaling relations alone, then the best-fit $\sigma_8$ is somewhat
higher than the preferred WMAP 3-yr value. It is
interesting to quantify the degeneracy between $K_0$ and $\sigma_8$
from the counts alone; in particular, to see how large a
change in $K_0$ is required if one insists on the preferred
WMAP 3-yr value of $\sigma_8 = 0.76$. We fix the power-law
form of the evolution, $K_0(z) = K_0(z = 0)(1 + z)^{-0.83}$
(and include a scatter $\sigma_{\ln L|M} = 0.59$, as before), and
compute the $\chi^2$ statistic from the number counts, varying
$K_0(z = 0)$ and $\sigma_8$ simultaneously. The results are shown
in Figure 6. As this figure reveals, the best-fit $K_0$ varies
monotonically with $\sigma_8$, by a factor of $\approx 2$ over the range
$0.7 < \sigma_8 < 0.85$. Also, the best-fit value for $\sigma_8$ from the
$L-T$ relation is significantly discrepant (at the $\sim 2\sigma$
level) from the central WMAP 3-yr value of $\sigma_8 = 0.76$;
this discrepancy can be eliminated by increasing $K_0$ by
$\approx 20\%$.

Fig. 6. - Constraints from the observed X-ray cluster number
counts (Vikhlinin et al. 1998) on the normalization of the evolving
entropy floor $K_0(z) = K_0(z = 0)(1 + z)^{-0.83} h^{-1/3}$ keV cm$^2$, for
different fixed values of $\sigma_8$. The $y$-axis shows the $\chi^2$, computed
from equation (15). (Predictions for the counts include the intrinsic
scatter $\sigma_{\ln L|M}$.)

Degeneracy between $\Omega_m$ and $\sigma_8$. Cluster number
counts produce a well-known degeneracy between $\Omega_m$
and $\sigma_8$, approximately of the form $\sigma_8 \Omega_m^{0.5} =$ constant for
shallow X-ray counts (e.g., Eke, Cole & Frenk 1996;
Bahcall & Fan 1998).

To examine the impact of uncertainty in $\Omega_m$ on our
results, we changed $\Omega_m$ from 0.24 to 0.30 (corresponding
to a change in $\Omega_m h^2$ from 0.13 to 0.16). We otherwise fix the
WMAP 3-year cosmological parameters, and re-fit the
$L-T$ scaling relations. We find that the best-fit evolving
entropy floor is decreased significantly, by $\sim 40\%$, to
$K_0(z) = 194(1 + z)^{-0.72} h^{-1/3}$ keV cm$^2$. This can be
understood easily: increasing $\Omega_m$ decreases the cosmic
baryon fraction. For a cluster with fixed $(M_{\rm vir}, z)$, the
baryon content is therefore decreased. This reduces its
luminosity at the same entropy floor. On the other
hand, the temperature is essentially unchanged. The net
result is that the normalization of the $L-T$ relation
is reduced, and less entropy is needed to bring it into
agreement with the observations.

The model with the best-fit entropy floor is then found
to overpredict the X-ray cluster counts. This is mostly
due to the increase in the underlying mass function
$dn/dM$, though we also find an increased detection
probability for the low-mass clusters at a given flux limit,
which may be caused by increased mean luminosity,
reduced luminosity distance, etc. Allowing $\sigma_8$ to vary, we
find a best-fit value of $\sigma_8 = 0.66$. This value is smaller
than $\sigma_8 = 0.72$, the value expected from the usual
degeneracy $\sigma_8 \Omega_m^{0.5}$. (Intrinsic scatters are included in the
analysis as in § 5.)
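The expected shift along the usual degeneracy, and the quoted change in the physical matter density, can be checked with a few lines of Python. The exponent of 0.5 and the starting value of 0.80 are assumptions of this sketch (the exponent in particular varies somewhat between analyses), and the function name is hypothetical.

```python
def sigma8_from_degeneracy(sigma8_ref, omega_m_ref, omega_m_new, exponent=0.5):
    """Value of sigma_8 that keeps sigma_8 * Omega_m**exponent constant."""
    return sigma8_ref * (omega_m_ref / omega_m_new) ** exponent


# Shift expected when Omega_m is raised from 0.24 to 0.30.
expected_sigma8 = sigma8_from_degeneracy(0.80, 0.24, 0.30)

# Omega_m h^2 = 0.13 at Omega_m = 0.24 fixes h^2; the same h^2 then
# gives the physical density at Omega_m = 0.30.
h_squared = 0.13 / 0.24
omega_m_h2_new = 0.30 * h_squared
print(expected_sigma8, omega_m_h2_new)
```

Under these assumptions the expected normalization is about 0.72 and the physical matter density about 0.16, matching the values quoted in the text; the best-fit value of 0.66 therefore lies below the simple degeneracy line.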

6.5. Which Clusters Are Responsible for the Number Counts Constraints?

It is useful to know, within our model, the masses and
redshifts of clusters that dominate the number counts.
In Figure 7, we show $dN(>f)/dz$ and $M_{\rm vir}(f, z)$ at the
four different flux thresholds we utilized from Vikhlinin
et al. (1998). The constant intrinsic scatter $\sigma_{\ln L|M}$
is adopted in the calculation of $dN/dz$, so $M_{\rm vir}(f, z)$ is
actually the mass of the clusters that have a 50% detection
probability. As this figure shows, most of the clusters
are in the range $0.05 \lesssim z \lesssim 0.15$ and have masses of
$\approx 0.5-2 \times 10^{14} M_\odot$. The results also show that we have
included some low-mass clusters (few $\times 10^{13} M_\odot$), but
these clusters constitute only a small fraction
of the total; e.g., at $f = 5.05 \times 10^{-??}$ erg/s/cm$^2$, the
fraction of clusters with $M_{\rm vir} < 5 \times 10^{13} M_\odot$ is $\sim 13\%$.

Fig. 7. - The upper panel shows the redshift distribution of
the cumulative number density of the X-ray clusters, predicted
at the flux thresholds of the four data points displayed in
Figure 3. Our best-fit preheating model is used with $K_0(z) =
341(1 + z)^{-0.83} h^{-1/3}$ keV cm$^2$ and the WMAP 3-year
cosmology, except with $\sigma_8 = 0.8$. The lower panel shows the mean mass
corresponding to the four different flux thresholds, as a function of
redshift.

6.6. The Impact of Malmquist Bias on the Entropy Floor

As an "academic exercise", it is useful to assess the
impact of incorporating a flux limit, and the corresponding
Malmquist bias, into our analysis. For this purpose, we
assume that there is an intrinsic scatter of $\sigma_{\ln L|T} = 0.3$
as before, but we do not apply any flux limit (this
corresponds to setting $x_{\rm min} \rightarrow -\infty$ in Section 5.1). The
best-fit entropy floor is found to be $333\,h^{-1/3}$ keV cm$^2$
for the HIFLUGCS clusters and $174\,h^{-1/3}$ keV cm$^2$ for
the WARPS clusters (the two clusters with fluxes below
the claimed flux limit are excluded, for the purpose of
fairly comparing with the results that take into account
the effect of the flux limit). The entropy evolution is
now $K_0(z) = 353(1 + z)^{-1.2} h^{-1/3}$ keV cm$^2$.

Compared with the results with no intrinsic scatter, 
the entropy levels favored by these two cluster samples 
both increase. This increase is caused by the constant 
intrinsic scatter added to the denominator in the calculation
of $\chi^2$, which changes the relative weight
of each cluster (more specifically, reducing the down- 
weighting of the (small) clusters that require a higher 
entropy floor) . 

Compared with the results that include the intrinsic 
scatter and also apply the survey flux limits, the entropy 
level for the HIFLUGCS sample increases a little, while 
for the WARPS sample, it decreases. Overall, the im- 
pact of the flux limit is surprisingly modest. One naively 
expects that the clusters that are most important for de- 
termining the best-fit value for the entropy floor are the 
smallest ones, i.e. those just above the detection thresh- 
old, which are most susceptible to bias effects. In par- 
ticular, a naive expectation is that this bias will increase 
the average luminosity, and will require a larger entropy 
floor. It is therefore worth understanding the relative 
insensitivity of our results to imposing a flux limit. 

The effects of applying the flux limits have been an- 
alyzed in Section 5.1: in addition to increasing the av- 
erage luminosity, it also decreases the intrinsic scatter. 
The former effect shifts the best-fit entropy to a higher 
value, while the latter preferentially increases the value 
of $\chi^2$ at a larger entropy floor, and effectively shifts the
best-fit entropy level to a lower value. Depending on 
the competition between these two effects, the net result 
may be either a larger or a smaller value for the best- 
fit entropy floor. To clarify this competition, we per- 
form an intermediate calculation, in which the effect of 
the flux limit is included only on the average luminosity 
(i.e. artificially ignoring the corresponding reduction in 
the scatter). We find the HIFLUGCS clusters now favor 
$K_0 = 411\,h^{-1/3}$ keV cm$^2$, and the WARPS clusters favor
$K_0 = 251\,h^{-1/3}$ keV cm$^2$. These values are much larger
than the values obtained by assuming there are no flux
limits, demonstrating that the robustness of the inferred
entropy floor results from the above-mentioned
cancellation. We conclude that, provided the intrinsic scatter is
known a priori (before a flux limit is applied), the effect
of Malmquist bias on the inferred entropy floor is small. 

6.7. Predictions for the SZ Decrement 

As mentioned above, our model fully determines other
possible observables, such as those that can be
measured with the SZ effect. In Figure 8, we plot
predictions for the $Y_{2500}-T$ scaling relation, together with
the data from Bonamente et al. (2007) (see also Reese
et al. (2002); McCarthy et al. (2003); Bonamente et al.
(2006); LaRoque et al. (2006) for further discussions of
SZ decrements). Here $Y_{2500}$ is the integral of the usual
Compton parameter over the solid angle subtended by the
cluster within the projected radius $r_{2500}$ (the radius
that gives a mean enclosed density of 2500 times the
critical density), and $T$ is the (X-ray) emission-weighted
temperature as before. The solid curve corresponds to
our preheating model with the best-fit evolving entropy
floor, and the dashed curve, for reference, shows the
prediction of the model without preheating. Both are made at
the mean redshift of the data, $z = 0.2$. A visual inspection
of the dashed and solid curves ("chi by eye") indicates that
the data require preheating, and that the entropy level
we found from the X-ray scaling relations roughly agrees
with the data. A thorough investigation of the SZ
profiles, compared with the X-ray profiles, is likely to yield
interesting new constraints on preheated cluster models
(e.g. Cavaliere, Lapi, & Rephaeli 2005), but we leave this
to future work.

Fig. 8. - The $Y_{2500}-T$ Sunyaev-Zel'dovich scaling relations
predicted by the preheating model with the best-fit evolving
entropy floor given by $K_0(z) = 341(1 + z)^{-0.83} h^{-1/3}$ keV cm$^2$ (solid
curve), and without an entropy floor (dashed curve), at redshift
$z = 0.2$. The points with error bars are data from Bonamente et
al. (2007) for clusters within the redshift range [0.1, 0.3].

6.8. Moore vs. NFW Dark Matter Profiles

High-resolution numerical simulations suggest that the 
dark matter distribution in the central regions of virial- 
ized haloes is significantly steeper than the NFW shape 
(Moore et al. 1998; Klypin et al. 2001). To examine the 
dependence of our results on possible variations of the 
dark matter profile, here we adopt 

$$\rho(r) \propto \frac{1}{(r/r_s)^{1.5} (1 + r/r_s)^{1.5}} \qquad (22)$$

with a fixed concentration parameter $c = 4$, and
recompute our results. We find that the steeper dark
matter profile gives a higher central density and
temperature for the intracluster gas, so at a fixed $M_{\rm vir}$, both
$L$ and $T$ are increased, but at a fixed $T$, the
luminosity is decreased. As a result, less entropy is needed for
the preheating model to agree with the observed $L-T$
scaling relations. We find that the favored evolving entropy
floor is $K_0(z) = 265(1 + z)^{-0.?5} h^{-1/3}$ keV cm$^2$. With the
WMAP 3-year best-fit cosmology, the model overpredicts
the number density of the X-ray clusters, and the best-fit
$\sigma_8$ is lowered to 0.74. (Intrinsic scatters are included in
the analysis as in § 5.)
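The difference between the two dark matter profiles is easiest to see in their logarithmic slopes. The Python sketch below compares the NFW shape with a profile of inner slope 1.5 and outer slope 3, both written in units of the scale radius and with arbitrary normalization. The function names are hypothetical, and the normalization and concentration used in the model are not reproduced here.

```python
import math


def rho_nfw(x):
    """NFW shape, with x = r / r_s and arbitrary normalization."""
    return 1.0 / (x * (1.0 + x) ** 2)


def rho_steep(x):
    """Steeper-cusp shape with inner slope 1.5 and outer slope 3."""
    return 1.0 / (x ** 1.5 * (1.0 + x) ** 1.5)


def log_slope(rho, x, eps=1e-4):
    """Numerical d ln(rho) / d ln(x) by a centered difference."""
    upper = math.log(rho(x * (1.0 + eps)))
    lower = math.log(rho(x * (1.0 - eps)))
    return (upper - lower) / (math.log(1.0 + eps) - math.log(1.0 - eps))


for x in (1e-3, 1.0, 1e3):
    print(x, log_slope(rho_nfw, x), log_slope(rho_steep, x))
```

Both shapes fall off as the inverse cube of the radius at large radii, while the inner slope changes from -1 to -1.5, which is what concentrates more mass, and hence denser and hotter gas, toward the center.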

Fig. 9. The contributions to the X-ray luminosity ($L$, upper
panel), the central SZ decrement ($y_0$, central panel), and the
integrated SZ decrement ($Y_{\rm vir}$, lower panel) from
logarithmic radial bins, for both the NFW case (solid curves) and
the Moore et al. case (dotted curves). The profiles are shown for
two clusters at $z = 0.2$ with the same temperature $T = 4.93$ keV,
predicted with $K_0 = 0$. The figure demonstrates that $y_0$ is
more sensitive to the central regions, and is increased by
steepening the DM profile even though the cluster actually has a
smaller $M_{\rm vir}$ with this profile. (See discussion in the
text.)

We also use this steeper dark matter profile to predict the SZ
observables $y_0$ and $Y_{2500}$. We find, similarly to $L$ and
$T$, that at a fixed $M_{\rm vir}$ both $y_0$ and $Y_{2500}$ are
increased; at a fixed $T$, however, $Y_{2500}$ is decreased, but
$y_0$ is increased. To clarify these changes, we examined two
clusters with the same temperature of $T \approx 5$ keV at
$z = 0.2$ (predicted by setting $K_0 = 0$; we find this requires
the cluster to have a mass of $M_{\rm vir} = 9.3 \times 10^{14}\,M_\odot$
for the NFW case, and $M_{\rm vir} = 7.3 \times 10^{14}\,M_\odot$
for the Moore et al. case). Since $M_{\rm vir}$ for the Moore et
al. case is smaller, it is understandable that $L$ and $Y_{2500}$
also get smaller (from $L = 8.02 \times 10^{44}$ erg s$^{-1}$ to
$L = 7.25 \times 10^{44}$ erg s$^{-1}$, and from
$Y_{2500} = 5.25 \times 10^{-11}$ to $Y_{2500} = 4.65 \times 10^{-11}$).
However, $y_0$ must be more sensitive to this steeper dark matter
profile than the other two observables, since it ends up
increasing (from $y_0 = 5.84 \times 10^{-5}$ to
$y_0 = 7.18 \times 10^{-5}$). In Figure 9, we explicitly show the
contributions to $L$, $y_0$ and $Y_{\rm vir}$ (similar to
$Y_{2500}$, except that the integration is done within the
projected radius of $r_{\rm vir}$) from logarithmic radial bins
for both the NFW case (solid curves) and the Moore et al. case
(dotted curves). We show $Y_{\rm vir}$ instead of $Y_{2500}$ in
order to remove the additional geometrical weighting of different
radial bins; for reference, $Y_{\rm vir}$ decreases from
$1.8 \times 10^{-10}$ in the NFW case to $1.2 \times 10^{-10}$ in
the Moore et al. case. The figure clearly shows that $y_0$, $L$
and $Y_{\rm vir}$ are dominated by logarithmic radial bins at
increasingly larger radii. This behavior can be explained by the
fact that the X-ray luminosity and the integrated SZ decrement
are integrals over volume ($\propto r^3$), whereas the central SZ
decrement is an integral along the line of sight ($\propto r$).
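
The weighting argument can be illustrated with a toy calculation. The Python sketch below is not the model of this paper. It assumes an isothermal beta-model gas density with $\beta = 2/3$ (a hypothetical stand-in for the actual gas profile), ignores the temperature dependence of the X-ray emissivity, and uses illustrative function names. Per logarithmic radial bin, the line-of-sight integral for $y_0$ is weighted by $n\,r$, while the volume integrals for $L$ and $Y$ are weighted by $n^2 r^3$ and $n\,r^3$.

```python
import math

def beta_model_density(x, beta=2.0 / 3.0):
    """Toy isothermal beta-model gas density, n(x) = (1 + x^2)^(-3 beta / 2).

    x = r / r_core. This profile is an assumption made for illustration only.
    """
    return (1.0 + x * x) ** (-1.5 * beta)

def log_bin_weights(x):
    """Contribution per logarithmic radial bin (arbitrary normalization)."""
    n = beta_model_density(x)
    return {
        "y0": n * x,           # line-of-sight pressure integral: n T dr = n T r dln(r)
        "L": n * n * x ** 3,   # volume integral of n^2: n^2 r^2 dr = n^2 r^3 dln(r)
        "Y": n * x ** 3,       # volume integral of pressure: n T r^2 dr = n T r^3 dln(r)
    }

def peak_radius(name, x_min=1e-2, x_max=10.0, n_bins=400):
    """Radius (in core radii) of the logarithmic bin with the largest weight."""
    step = (math.log(x_max) - math.log(x_min)) / (n_bins - 1)
    grid = [x_min * math.exp(i * step) for i in range(n_bins)]
    return max(grid, key=lambda x: log_bin_weights(x)[name])

peaks = {name: peak_radius(name) for name in ("y0", "L", "Y")}

if __name__ == "__main__":
    for name, radius in peaks.items():
        print(f"{name}: dominant logarithmic bin near r / r_core = {radius:.2f}")
```

In this toy profile the $y_0$ weight peaks near the core radius, the $L$ weight near $\sqrt{3}$ core radii, and the $Y$ weight keeps rising out to the outer boundary of the grid, which reproduces the ordering described in the text.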

6.9. Comparison with Younger et al. (2006) 

With our best-fit preheating model, adjusted to satisfy the X-ray
scaling relations, and with the WMAP 3-year cosmology, we found
that the cumulative number counts of the X-ray clusters were
underpredicted. This differs from the conclusion of earlier work
(Younger et al. 2006), which found an overprediction in a similar
model (Ostriker, Bode & Babul 2005 also found an overprediction,
using a higher $\sigma_8$ and a more elaborate cluster structure
model). By comparing our prediction (without intrinsic scatter)
with that of Younger et al. (2006), we find that the discrepancy
can be attributed to four differences between our calculation and
theirs. First, we use a larger value of the entropy floor in the
redshift range where the clusters dominate the number counts,
compared with the constant entropy floor of $194\,h^{-1/3}$ keV
cm$^2$ adopted by Younger et al. (2006). Second, we use the WMAP
3-year cosmological model with $\sigma_8 = 0.76$ instead of the
WMAP 1-year cosmological model with $\sigma_8 = 0.7$. Third, in
our preheating model, we use the fitting formula for the baseline
entropy profile from hydrodynamic simulations, which is higher in
the central regions than that adopted by Younger et al. (2006).
Fourth, we include 20% non-thermal pressure support. All of these
differences except our larger $\sigma_8$ reduce the predicted
number density, and the combined reduction outweighs the increase
caused by the larger $\sigma_8$, leading to a net decrease in the
predicted counts.

6.10. Expected Entropy Evolution 

Since we find evidence for a significant increase in the entropy
floor from the $z \sim 0.8$ to the $z \sim 0.05$ cluster sample,
it is interesting to ask whether such an evolution is indeed
expected if energy is continuously being injected into the
intracluster gas. It is possible to estimate the entropy history
of the IGM from the known global evolution of AGN and of the star
formation rate. For example, Valageas & Silk (1999) find that the
mean entropy level of the IGM increases with time in both
scenarios. In this case, clusters that form at earlier times will
indeed contain gas with a lower entropy floor. Assuming the
resulting entropy floor can be represented by the background
entropy at the formation redshift, we find that the entropy floor
increases by a factor of $\sim 2$ from clusters at $z = 0.8$ to
clusters at $z = 0.05$, according to the calculation of Valageas
& Silk (1999) for the AGN heating scenario (see their Figure 2;
in the stellar heating case, the evolution is much steeper, but
it is unclear whether stars can provide the necessary amount of
heat). This increase is comparable to our findings: $\sim 70\%$
when we assume no intrinsic scatter, and $\sim 60\%$ when we
include an intrinsic scatter. Of course, this comparison is based
on a simple assumption, and the heating sources in
(proto)clusters may also differ from the global average
population. We leave a more detailed comparison to future work.
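
To compare these fractional increases on a common footing, one can convert each into an equivalent power-law index. The Python sketch below assumes the floor scales as $K_0 \propto (1+z)^{-\alpha}$ between the two sample redshifts, which is the functional form used for the fits in this paper. The function name is hypothetical and the inputs are the rounded percentages quoted in the text, so the resulting indices are indicative only.

```python
import math

def power_law_index(increase_factor, z_high=0.8, z_low=0.05):
    """Index alpha for which K0 ~ (1 + z)^(-alpha) grows by `increase_factor`
    between z_high and z_low.

    Solves increase_factor = [(1 + z_high) / (1 + z_low)]^alpha for alpha.
    """
    return math.log(increase_factor) / math.log((1.0 + z_high) / (1.0 + z_low))

cases = [
    ("fit without intrinsic scatter (~70% increase)", 1.7),
    ("fit with intrinsic scatter (~60% increase)", 1.6),
    ("AGN heating estimate (factor ~2)", 2.0),
]

if __name__ == "__main__":
    for label, factor in cases:
        print(f"{label}: alpha = {power_law_index(factor):.2f}")
```

With these rounded inputs the indices come out at roughly 1.0, 0.9 and 1.3 respectively, so the AGN-heating estimate is somewhat steeper than, but of the same order as, the evolution inferred from the fits.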

7. CONCLUSIONS 

There is ample evidence that non-gravitational processes, such
as feedback from stars and BHs in galaxies, have injected excess
entropy into the intracluster gas, and therefore have modified
its density profile. In the simplest scenario, the excess entropy
is injected at high redshift, well before clusters actually form,
and results in a universal entropy floor in galaxy clusters. A
more realistic expectation is that the amount of excess entropy
evolves with cosmic epoch, tracking ongoing star and BH
formation.

Here we studied a simple model of this preheating scenario, and
found that it can simultaneously explain both global X-ray
scaling relations and number counts of galaxy clusters. The level
of entropy required between $z = 0$ and $z = 1$ is
$\sim 200-300$ keV cm$^2$, corresponding to
$\approx 0.6-0.9\,[(1+\delta)/100]^{2/3}\,[(1+z)/2]^2$ keV per
particle if the energy is deposited in gas at overdensity
$\delta$ at redshift $z$. This overall level is in agreement with
previous studies. Here we find, additionally, evidence that the
entropy floor evolves with redshift, increasing by about 60% from
$z = 0.8$ to $z = 0.05$. This fractional increase is in rough
agreement with the evolution expected if the heating rate follows
the global evolution of the AGN. The normalization
$\sigma_8 = 0.8$ preferred when X-ray cluster number counts are
fit with our model is somewhat higher than the best-fit value
from the three-year WMAP data. For a flux-limited cluster
catalog, we also find that including an intrinsic scatter in
log-luminosity, both at fixed temperature and at fixed mass, does
not have a big effect on the results.
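
The conversion from an entropy floor to an energy per particle can be checked with a short calculation. The Python sketch below is an order-of-magnitude illustration under assumptions that are not spelled out in this section: the entropy is taken to be $K = kT / n_e^{2/3}$, the energy per particle is taken to be of order $kT = K\,n_e^{2/3}$, the baryon density is $\Omega_b h^2 \approx 0.0223$, the gas is fully ionized with a hydrogen mass fraction of 0.75, and the $h^{-1/3}$ factor in the entropy units is ignored. All names and constants in the code are illustrative.

```python
RHO_CRIT_OVER_H2 = 1.878e-29  # critical density divided by h^2, in g cm^-3
PROTON_MASS = 1.6726e-24      # in g
OMEGA_B_H2 = 0.0223           # assumed baryon density parameter times h^2
HYDROGEN_FRACTION = 0.75      # assumed hydrogen mass fraction

def electron_density(one_plus_delta, z):
    """Electron number density (cm^-3) of fully ionized gas at overdensity
    (1 + delta) relative to the cosmic mean, at redshift z."""
    mean_nucleon_density = OMEGA_B_H2 * RHO_CRIT_OVER_H2 / PROTON_MASS
    # One electron per hydrogen nucleon, one per two helium nucleons.
    electrons_per_nucleon = HYDROGEN_FRACTION + (1.0 - HYDROGEN_FRACTION) / 2.0
    return (electrons_per_nucleon * mean_nucleon_density
            * one_plus_delta * (1.0 + z) ** 3)

def energy_per_particle_kev(entropy_kev_cm2, one_plus_delta=100.0, z=1.0):
    """Approximate energy per particle, kT = K * n_e^(2/3), in keV."""
    return entropy_kev_cm2 * electron_density(one_plus_delta, z) ** (2.0 / 3.0)

if __name__ == "__main__":
    for entropy in (200.0, 300.0):
        print(entropy, "keV cm^2 ->", energy_per_particle_kev(entropy), "keV")
```

With these assumed inputs, entropy floors of 200 and 300 keV cm$^2$ deposited at $1+\delta = 100$ and $z = 1$ correspond to roughly 0.6 and 0.9 keV per particle, and the result scales as $(1+\delta)^{2/3}(1+z)^2$, consistent with the expression quoted above.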